US20210357138A1 - Optimal placement of data structures in a hybrid memory based inference computing platform - Google Patents

Optimal placement of data structures in a hybrid memory based inference computing platform

Info

Publication number
US20210357138A1
US20210357138A1
Authority
US
United States
Prior art keywords
memory
data structures
activations
weights
write
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US15/929,618
Other versions
US11175844B1 (en)
Inventor
Ashish Ranjan
Arvind Kumar
Carl Radens
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignment of assignors interest (see document for details). Assignors: KUMAR, ARVIND; RADENS, CARL; RANJAN, ASHISH
Priority to US15/929,618 (US11175844B1)
Priority to GB2218461.8A (GB2610975A)
Priority to PCT/CN2021/087267 (WO2021227757A1)
Priority to DE112021001597.4T (DE112021001597T5)
Priority to JP2022564490A (JP7609537B2)
Priority to CN202180032075.6A (CN115516435A)
Publication of US11175844B1
Application granted
Publication of US20210357138A1
Legal status: Active (current)
Anticipated expiration

Abstract

In a deep neural network (DNN), weights represent the strength of the connections between neurons, and activations represent the output a neuron produces after its input passes through an activation function that applies a threshold. Weight traffic to a hybrid memory is therefore distinguished from activation traffic, and one or more data structures may be dynamically allocated in the hybrid memory according to whether they hold weights or activations of the DNN. The hybrid memory includes at least a first memory and a second memory that differ in their write endurance attributes.
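The placement policy the abstract describes can be sketched in code. The sketch below is illustrative, not taken from the patent: all class and field names are assumptions. The intuition is that weights are written once per model load and read many times, so they can live in a dense memory with limited write endurance (e.g., PCM), while activations are rewritten on every inference and belong in a write-tolerant memory (e.g., DRAM).

```python
from dataclasses import dataclass
from enum import Enum

class DataKind(Enum):
    WEIGHT = 0      # flag value indicating weights
    ACTIVATION = 1  # flag value indicating activations

@dataclass
class Tensor:
    name: str
    kind: DataKind  # flag attached to the data structure

class HybridMemoryController:
    """Illustrative controller that routes DNN data structures between
    two memories that differ in write endurance (e.g., PCM vs. DRAM)."""

    def __init__(self):
        self.low_endurance = []   # e.g., PCM: dense, but tolerates few writes
        self.high_endurance = []  # e.g., DRAM: tolerates frequent writes

    def allocate(self, tensor: Tensor) -> str:
        # Weights: written once at model load, read often -> low-endurance memory.
        # Activations: rewritten on every inference -> high-endurance memory.
        if tensor.kind is DataKind.WEIGHT:
            self.low_endurance.append(tensor.name)
            return "low-endurance"
        self.high_endurance.append(tensor.name)
        return "high-endurance"

ctrl = HybridMemoryController()
print(ctrl.allocate(Tensor("conv1.weight", DataKind.WEIGHT)))   # low-endurance
print(ctrl.allocate(Tensor("conv1.out", DataKind.ACTIVATION)))  # high-endurance
```

The key design point, matching the abstract, is that the allocation decision needs only a one-bit distinction between weight traffic and activation traffic, which is cheap to attach to each data structure.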

Description

Claims (20)

8. A system for optimized placement of data structures in memory in a computing environment, comprising:
one or more computers with executable instructions that when executed cause the system to:
distinguish, by a memory controller, between weights and activations of one or more data structures in a deep neural network (DNN) using flags attached to the one or more data structures, the flags having a first value indicative of the weights and a second value indicative of the activations; and
dynamically allocate and route the one or more data structures in a hybrid memory according to the flags indicative of the weights and activations of the one or more data structures in the DNN, wherein the hybrid memory includes at least a first memory and a second memory that differ according to one or more write attributes.
15. A computer program product for optimized placement of data structures in memory by a processor in a computing environment, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
an executable portion that distinguishes, by a memory controller, between weights and activations of one or more data structures in a deep neural network (DNN) using flags attached to the one or more data structures, the flags having a first value indicative of the weights and a second value indicative of the activations; and
an executable portion that dynamically allocates and routes the one or more data structures in a hybrid memory according to the flags indicative of the weights and activations of the one or more data structures in the DNN, wherein the hybrid memory includes at least a first memory and a second memory that differ according to one or more write attributes.
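Claims 8 and 15 both recite flags with a first value indicating weights and a second value indicating activations, plus dynamic allocation and routing. One way such dynamic behavior could look, sketched as a hypothetical illustration (the flag values follow the claims; the migration threshold, memory names, and all identifiers are assumptions, not from the patent):

```python
WEIGHT_FLAG, ACTIVATION_FLAG = 0, 1  # first/second flag values per claim 8
WRITE_THRESHOLD = 100  # assumed migration threshold, for illustration only

class FlagRoutingController:
    """Illustrative controller: initial placement comes from the attached
    flag; structures that later take heavy write traffic are migrated
    out of the endurance-limited memory."""

    def __init__(self):
        self.placement = {}     # structure id -> "pcm" or "dram"
        self.write_counts = {}  # structure id -> writes observed

    def route(self, struct_id, flag):
        # Initial placement purely from the attached flag.
        self.placement[struct_id] = "pcm" if flag == WEIGHT_FLAG else "dram"
        self.write_counts[struct_id] = 0
        return self.placement[struct_id]

    def record_write(self, struct_id):
        self.write_counts[struct_id] += 1
        # Dynamic re-allocation: a structure receiving heavy write traffic
        # is moved into the write-tolerant memory.
        if (self.placement[struct_id] == "pcm"
                and self.write_counts[struct_id] > WRITE_THRESHOLD):
            self.placement[struct_id] = "dram"
        return self.placement[struct_id]

ctrl = FlagRoutingController()
ctrl.route("w0", WEIGHT_FLAG)      # initially placed in "pcm"
for _ in range(101):
    loc = ctrl.record_write("w0")
print(loc)                          # "dram" once the threshold is exceeded
```

The write-count migration step is one possible reading of "dynamically allocate and route"; the claims themselves specify only that allocation follows the flags.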

Priority Applications (6)

Application Number | Priority Date | Filing Date | Title
US15/929,618 (US11175844B1) | 2020-05-13 | 2020-05-13 | Optimal placement of data structures in a hybrid memory based inference computing platform
JP2022564490A (JP7609537B2) | 2020-05-13 | 2021-04-14 | Optimal allocation method and system for hybrid memory-based data structures
PCT/CN2021/087267 (WO2021227757A1) | 2020-05-13 | 2021-04-14 | Optimal placement of data structures in a hybrid memory based inference computing platform
DE112021001597.4T (DE112021001597T5) | 2020-05-13 | 2021-04-14 | Optimale Platzierung von Datenstrukturen in einer Inferenz-Datenverarbeitungsplattform auf Grundlage eines hybriden Speichers
GB2218461.8A (GB2610975A) | 2020-05-13 | 2021-04-14 | Optimal placement of data structures in a hybrid memory based inference computing platform
CN202180032075.6A (CN115516435A) | 2020-05-13 | 2021-04-14 | Optimized arrangement of data structures in hybrid memory-based inferential computing platforms

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
US15/929,618 (US11175844B1) | 2020-05-13 | 2020-05-13 | Optimal placement of data structures in a hybrid memory based inference computing platform

Publications (2)

Publication Number | Publication Date
US11175844B1 | 2021-11-16
US20210357138A1 | 2021-11-18

Family

ID=78512425

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US15/929,618 (US11175844B1, Active) | Optimal placement of data structures in a hybrid memory based inference computing platform | 2020-05-13 | 2020-05-13

Country Status (6)

Country | Link
US (1) | US11175844B1 (en)
JP (1) | JP7609537B2 (en)
CN (1) | CN115516435A (en)
DE (1) | DE112021001597T5 (en)
GB (1) | GB2610975A (en)
WO (1) | WO2021227757A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20220318656A1 (en)* | 2021-03-30 | 2022-10-06 | EMC IP Holding Company LLC | Model parameter sharing between inference application instances in processing unit of information processing system
CN114637466B (en)* | 2022-03-03 | 2022-11-11 | Shenzhen University | A data reading and writing behavior inference method, device, storage medium and electronic device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20160086078A1 (en)* | 2014-09-22 | 2016-03-24 | Zhengping Ji | Object recognition with reduced neural network weight precision
US20180018560A1 (en)* | 2016-07-14 | 2018-01-18 | Manuel SALDANA | Systems, methods and devices for data quantization
US20180082181A1 (en)* | 2016-05-13 | 2018-03-22 | Samsung Electronics Co., Ltd. | Neural Network Reordering, Weight Compression, and Processing
US20190244106A1 (en)* | 2018-02-08 | 2019-08-08 | Western Digital Technologies, Inc. | Convolution engines for systolic neural network processor
US20190286972A1 (en)* | 2018-03-14 | 2019-09-19 | Microsoft Technology Licensing, LLC | Hardware accelerated neural network subgraphs
US20190303743A1 (en)* | 2016-08-13 | 2019-10-03 | Intel Corporation | Apparatuses, methods, and systems for neural networks
US20190378001A1 (en)* | 2018-06-12 | 2019-12-12 | Samsung Electronics Co., Ltd. | Neural network hardware acceleration with stochastic adaptive resource allocation
US20200210840A1 (en)* | 2018-12-31 | 2020-07-02 | Microsoft Technology Licensing, LLC | Adjusting precision and topology parameters for neural network training based on a performance metric
US20200226453A1 (en)* | 2020-03-27 | 2020-07-16 | Intel Corporation | Methods and apparatus for dynamic batching of data for neural network workloads
US20200285950A1 (en)* | 2017-04-04 | 2020-09-10 | Hailo Technologies Ltd. | Structured Weight Based Sparsity In An Artificial Neural Network Compiler
US20200380306A1 (en)* | 2019-06-03 | 2020-12-03 | Wipro Limited | System and method for implementing neural network models on edge devices in IoT networks
US20210019633A1 (en)* | 2019-07-15 | 2021-01-21 | Facebook Technologies, LLC | System and method for shift-based information mixing across channels for ShuffleNet-like neural networks

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103678143B (en) | 2012-09-25 | 2018-10-12 | Lenovo (Beijing) Co., Ltd. | File memory method, device and electronic equipment
US9195934B1 | 2013-01-31 | 2015-11-24 | Brain Corporation | Spiking neuron classifier apparatus and methods using conditionally independent subsets
CN105094686B (en) | 2014-05-09 | 2018-04-10 | Huawei Technologies Co., Ltd. | Data cache method, caching and computer system
US9830086B2 | 2016-03-03 | 2017-11-28 | Samsung Electronics Co., Ltd. | Hybrid memory controller for arbitrating access to volatile and non-volatile memories in a hybrid memory group
US20170322740A1 | 2016-05-09 | 2017-11-09 | Microsoft Technology Licensing, LLC | Selective data persistence in computing systems
US10230045B2 | 2017-04-21 | 2019-03-12 | Gyrfalcon Technology Inc. | Process of fabricating embedded spin transfer torque memory for cellular neural network based processing unit
TWI653584B | 2017-09-15 | 2019-03-11 | Chung Yuan Christian University | Method of judging neural network with non-volatile memory cells
US20190164037A1 | 2017-11-29 | 2019-05-30 | Electronics and Telecommunications Research Institute | Apparatus for processing convolutional neural network using systolic array and method thereof
US11941719B2 | 2018-01-23 | 2024-03-26 | Nvidia Corporation | Learning robotic tasks using one or more neural networks
JP7349438B2 | 2018-02-16 | 2023-09-22 | Samsung Electronics Co., Ltd. | Neural network accelerator
US11501140B2 | 2018-06-19 | 2022-11-15 | International Business Machines Corporation | Runtime reconfigurable neural network processor core

Also Published As

Publication number | Publication date
JP7609537B2 (en) | 2025-01-07
WO2021227757A1 (en) | 2021-11-18
GB2610975A (en) | 2023-03-22
GB202218461D0 (en) | 2023-01-25
US11175844B1 (en) | 2021-11-16
JP2023524407A (en) | 2023-06-12
DE112021001597T5 (en) | 2023-02-09
CN115516435A (en) | 2022-12-23

Similar Documents

Publication | Title
US11748648B2 (en) | Quantum pulse optimization using machine learning
US11521067B2 (en) | Decentralized distributed deep learning
US11681796B2 (en) | Learning input preprocessing to harden machine learning models
US11501160B2 (en) | Cloud computing data compression for allreduce in deep learning
US20240193486A1 (en) | Accelerated machine learning
US11790231B2 (en) | Determining optimal augmentations for a training data set
US11966776B2 (en) | Learning agent based application scheduling
US11551129B2 (en) | Quantum platform routing of a quantum application component
US11216281B2 (en) | Facilitating data processing using SIMD reduction operations across SIMD lanes
US20190325295A1 (en) | Time, space, and energy efficient neural inference via parallelism and on-chip memory
US20240104418A1 (en) | Graphics processing unit training job allocation
WO2021227757A1 (en) | Optimal placement of data structures in a hybrid memory based inference computing platform
US11080486B2 (en) | Remote neural network processing for guideline identification
JP7633759B2 (en) | Cooperative neural networks with spatial confinement constraints
WO2021180548A1 (en) | Inducing creativity in an artificial neural network
WO2022227860A1 (en) | Fair simultaneous comparison of parallel machine learning models
US11941111B2 (en) | Exploiting fine-grained structured weight sparsity in systolic arrays
US11429524B2 (en) | Optimized hierarchical scratchpads for enhanced artificial intelligence accelerator core utilization
US11526791B2 (en) | Methods and systems for diverse instance generation in artificial intelligence planning
US20210216879A1 (en) | Methods and systems for improving heuristic searches for artificial intelligence planning
US20200356371A1 (en) | Reusing an operand in an instruction set architecture (ISA)
US20230325256A1 (en) | Deep neural network management of overbooking in a multi-tenant computing environment
US11288046B2 (en) | Methods and systems for program optimization utilizing intelligent space exploration
US20230186130A1 (en) | Quantum circuit buffering
US20230099635A1 (en) | Context aware automated artificial intelligence framework

Legal Events

Code | Title | Description
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
STCF | Information on status: patent grant | Free format text: PATENTED CASE
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 4
