US20180253818A1 - Deep learning via dynamic root solvers - Google Patents

Deep learning via dynamic root solvers

Info

Publication number
US20180253818A1
Authority
US
United States
Prior art keywords
gpus
gpu
initial
current
computer system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/906,044
Inventor
Anto Ajay Raj John
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-03-03
Filing date
2018-02-27
Publication date
2018-09-06
Application filed by International Business Machines Corp
Priority to US15/906,044
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION (assignment of assignors interest; see document for details). Assignor: JOHN, ANTO AJAY RAJ
Publication of US20180253818A1
Current legal status: Abandoned


Abstract

The present invention provides a computer implemented method, system, and computer program product of deep learning via dynamic root solvers. In an embodiment, the present invention includes (1) forming an initial set of GPUs into an initial binary tree architecture, where the initial set includes initially idle GPUs and an initial root solver GPU as the root of the initial binary tree architecture, (2) calculating initial gradients and initial adjusted weight data, (3) choosing a first currently idle GPU as a current root solver GPU, (4) forming a current set of GPUs into a current binary tree architecture, where the current set includes the additional currently idle GPUs and the current root solver GPU as the root of the current binary tree architecture, (5) calculating current gradients and current adjusted weight data, and (6) transmitting an initial update to the weight data to the available GPUs.
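As an illustration of steps (1), (2), and (6) of the abstract, the following Python sketch forms a binary tree with the root solver at the top, sums per-GPU gradients up the tree, and computes adjusted weights at the root. It is a minimal sketch, not the patented implementation: the heap-style tree layout, the plain averaged gradient step, the learning rate `lr`, and every function name are assumptions introduced here, and GPUs are modeled as integer IDs holding Python lists.

```python
from typing import Dict, List, Optional


def form_binary_tree(root: int, idle: List[int]) -> Dict[int, Optional[int]]:
    """Return a child -> parent map with `root` at the top.

    Assumption: GPUs are placed in heap order, so the node at position i
    has the node at position (i - 1) // 2 as its parent.
    """
    order = [root] + [g for g in idle if g != root]
    parent: Dict[int, Optional[int]] = {root: None}
    for i in range(1, len(order)):
        parent[order[i]] = order[(i - 1) // 2]
    return parent


def reduce_to_root(parent: Dict[int, Optional[int]],
                   grads: Dict[int, List[float]]) -> List[float]:
    """Sum each GPU's gradient into its parent, deepest nodes first."""
    def depth(g: int) -> int:
        d = 0
        while parent[g] is not None:
            g = parent[g]
            d += 1
        return d

    acc = {g: list(v) for g, v in grads.items()}
    root = next(g for g, p in parent.items() if p is None)
    for g in sorted(parent, key=depth, reverse=True):
        p = parent[g]
        if p is not None:
            acc[p] = [a + b for a, b in zip(acc[p], acc[g])]
    return acc[root]


def apply_update(weights: List[float], total_grad: List[float],
                 n_gpus: int, lr: float) -> List[float]:
    """Assumed update rule: one averaged gradient step at the root solver."""
    return [w - lr * g / n_gpus for w, g in zip(weights, total_grad)]
```

In this sketch the root solver would then send the list returned by `apply_update` to every available GPU, which corresponds to the transmitting step of the abstract.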

Claims (1)

What is claimed is:
1. A computer implemented method comprising:
identifying, by a host computer processor, graphic processor units (GPUs) that are available (available GPUs);
identifying, by the host computer processor, GPUs that are idle (initially idle GPUs) among the available GPUs for an initial iteration of deep learning,
wherein the identifying GPUs that are idle among the available GPUs comprises executing, by the host computer processor, a run command from a central processing unit (CPU) of each of the available GPUs to determine a percentage of the each of the available GPUs being utilized;
choosing, by the host computer processor, one of the initially idle GPUs as an initial root solver GPU for the initial iteration;
initializing, by the host computer processor, weight data for an initial set of multidimensional data;
transmitting, by the host computer processor, the initial set of multidimensional data to the available GPUs;
forming, by the host computer processor, an initial set of GPUs into an initial binary tree architecture, wherein the initial set comprises the initially idle GPUs and the initial root solver GPU, wherein the initial root solver GPU is the root of the initial binary tree architecture,
wherein the forming the initial set of GPUs into the initial binary tree architecture comprises logically connecting, by the host computer processor, a first GPU among the initially idle GPUs as a leaf node to a second GPU among the initially idle GPUs as a parent node if a fast communication link exists between the first GPU and the second GPU,
wherein the fast communication link comprises a peer-to-peer connection;
calculating, by the initial set of GPUs, initial gradients and a set of initial adjusted weight data with respect to the weight data and the initial set of multidimensional data via the initial binary tree architecture;
in response to the calculating the initial gradients and the initial adjusted weight data, identifying, by the host computer processor, a first GPU among the available GPUs to become idle (first currently idle GPU) for a current iteration of deep learning;
choosing, by the host computer processor, the first currently idle GPU as a current root solver GPU for the current iteration;
transmitting, by the host computer processor, a current set of multidimensional data to the current root solver GPU;
in response to the identifying the first currently idle GPU, identifying, by the host computer processor, additional GPUs that are currently idle (additional currently idle GPUs) among the available GPUs;
transmitting, by the host computer processor, the current set of multidimensional data to the additional currently idle GPUs;
forming, by the host computer processor, a current set of GPUs into a current binary tree architecture, wherein the current set comprises the additional currently idle GPUs and the current root solver GPU, wherein the current root solver GPU is the root of the current binary tree architecture;
calculating, by the current set of GPUs, current gradients and a set of current adjusted weight data with respect to at least the weight data and the current set of multidimensional data via the current binary tree architecture;
in response to the initial root solver GPU receiving a set of calculated initial adjusted weight data, transmitting, by the initial root solver GPU, an initial update to the weight data to the available GPUs;
in response to the current root solver GPU receiving a set of current initial adjusted weight data, transmitting, by the current root solver GPU, a current update to the weight data to the available GPUs; and
repeating the identifying, the choosing, the transmitting, the forming, and the calculating with respect to the weight data, updates to the weight data, and subsequent sets of multidimensional data, transmitting an initial update to the weight data to the available GPUs.
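The idle-GPU identification and the link-constrained tree formation recited in claim 1 can be sketched as follows. This is a hedged illustration only. The claim does not name the run command; `nvidia-smi` with its `--query-gpu` option is used here as one plausible choice. The 5% idle threshold, the breadth-first attachment order, the rule that GPUs without a peer-to-peer link to any tree node are left out, and all function names are assumptions made for this example.

```python
import subprocess
from typing import Dict, FrozenSet, List, Optional, Set

# Assumed run command; the claim only says "a run command".
QUERY = ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"]


def parse_utilization(text: str) -> Dict[int, int]:
    """Parse lines such as '0, 97' into {gpu_index: percent_utilized}."""
    util: Dict[int, int] = {}
    for line in text.strip().splitlines():
        idx, pct = line.split(",")
        util[int(idx)] = int(pct)
    return util


def query_utilization() -> Dict[int, int]:
    """Run the command on the host and parse its output."""
    out = subprocess.run(QUERY, capture_output=True, text=True,
                         check=True).stdout
    return parse_utilization(out)


def idle_gpus(util: Dict[int, int], threshold: int = 5) -> List[int]:
    """Treat a GPU as idle when utilization is at or below `threshold` percent."""
    return sorted(g for g, pct in util.items() if pct <= threshold)


def form_tree_with_links(root: int, idle: List[int],
                         p2p: Set[FrozenSet[int]]) -> Dict[int, Optional[int]]:
    """Attach a GPU under a parent only if a peer-to-peer link joins them.

    Each parent takes at most two children, so the result is a binary tree.
    """
    parent: Dict[int, Optional[int]] = {root: None}
    children = {root: 0}
    frontier = [root]
    pending = [g for g in idle if g != root]
    while frontier and pending:
        node = frontier.pop(0)
        for g in list(pending):
            if children[node] == 2:
                break
            if frozenset((node, g)) in p2p:
                parent[g] = node
                children[node] += 1
                children[g] = 0
                frontier.append(g)
                pending.remove(g)
    return parent
```

For a later iteration, the same functions could be called again with the first GPU observed to become idle passed as `root`, which mirrors how the claim selects the current root solver GPU.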
US15/906,044 | 2017-03-03 | 2018-02-27 | Deep learning via dynamic root solvers | Abandoned | US20180253818A1 (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
US15/906,044 | US20180253818A1 (en) | 2017-03-03 | 2018-02-27 | Deep learning via dynamic root solvers

Applications Claiming Priority (2)

Application Number | Publication | Priority Date | Filing Date | Title
US15/448,637 | US10210594B2 (en) | 2017-03-03 | 2017-03-03 | Deep learning via dynamic root solvers
US15/906,044 | US20180253818A1 (en) | 2017-03-03 | 2018-02-27 | Deep learning via dynamic root solvers

Related Parent Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US15/448,637 | Continuation | US10210594B2 (en) | 2017-03-03 | 2017-03-03 | Deep learning via dynamic root solvers

Publications (1)

Publication Number | Publication Date
US20180253818A1 (en) | 2018-09-06

Family

ID=63355124

Family Applications (3)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US15/448,637 | Expired - Fee Related | US10210594B2 (en) | 2017-03-03 | 2017-03-03 | Deep learning via dynamic root solvers
US15/857,765 | Expired - Fee Related | US10169084B2 (en) | 2017-03-03 | 2017-12-29 | Deep learning via dynamic root solvers
US15/906,044 | Abandoned | US20180253818A1 (en) | 2017-03-03 | 2018-02-27 | Deep learning via dynamic root solvers

Family Applications Before (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US15/448,637 | Expired - Fee Related | US10210594B2 (en) | 2017-03-03 | 2017-03-03 | Deep learning via dynamic root solvers
US15/857,765 | Expired - Fee Related | US10169084B2 (en) | 2017-03-03 | 2017-12-29 | Deep learning via dynamic root solvers

Country Status (1)

Country | Link
US (3) | US10210594B2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110889439A (en) * | 2019-11-08 | 2020-03-17 | 浪潮电子信息产业股份有限公司 | An image feature extraction method, device, electronic device and storage medium
WO2020109891A1 (en) * | 2018-11-30 | 2020-06-04 | International Business Machines Corporation | Decentralized distributed deep learning
US10732983B1 (en) * | 2019-05-02 | 2020-08-04 | Capital One Services, Llc | Systems and methods of parallel and distributed processing of datasets for model approximation
CN111966175A (en) * | 2020-08-14 | 2020-11-20 | 苏州浪潮智能科技有限公司 | A kind of GPU fixing structure and installation method thereof
CN114363985A (en) * | 2022-01-10 | 2022-04-15 | 黑龙江大学 | A method for constructing a binary tree based on node weights and a method for updating the binary tree
US20220215843A1 (en) * | 2021-01-04 | 2022-07-07 | Kwai Inc. | Systems and methods for automatic speech recognition based on graphics processing units
US20220262953A1 (en) * | 2019-08-08 | 2022-08-18 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor device
US11687778B2 (en) | 2020-01-06 | 2023-06-27 | The Research Foundation For The State University Of New York | Fakecatcher: detection of synthetic portrait videos using biological signals
US11922314B1 (en) * | 2018-11-30 | 2024-03-05 | Ansys, Inc. | Systems and methods for building dynamic reduced order physical models
CN117725348A (en) * | 2024-02-07 | 2024-03-19 | 蓝象智联(杭州)科技有限公司 | Thread management method and system in GPU computing large-scale array summation process
US20250217920A1 (en) * | 2023-12-28 | 2025-07-03 | Dell Products L.P. | Method, device, and product for gpu cluster

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10776982B2 (en) | 2017-07-03 | 2020-09-15 | Artomatix Ltd. | Systems and methods for providing non-parametric texture synthesis of arbitrary shape and/or material data in a unified framework
US10878482B2 (en) | 2018-01-19 | 2020-12-29 | Hypernet Labs, Inc. | Decentralized recommendations using distributed average consensus
US11244243B2 (en) * | 2018-01-19 | 2022-02-08 | Hypernet Labs, Inc. | Coordinated learning using distributed average consensus
US11526759B2 (en) * | 2018-11-05 | 2022-12-13 | International Business Machines Corporation | Large model support in deep learning
CN110543362B (en) * | 2019-07-31 | 2022-10-21 | 北京奇艺世纪科技有限公司 | Graphics processor management method and device and server
US12361266B2 (en) | 2020-05-14 | 2025-07-15 | Samsung Electronics Co., Ltd. | Hierarchical weight preprocessing for neural network accelerator
CN114896076B (en) * | 2022-07-15 | 2022-10-28 | 广州启智信息科技有限公司 | Resource allocation control method, system and device for graphics processor cluster
CN119557109B (en) * | 2025-01-24 | 2025-05-02 | 浙江华和万润信息科技有限公司 | AI talent culture resource allocation method, system, terminal and storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US6751640B1 (en) | 2000-11-20 | 2004-06-15 | Intel Corporation | Method and apparatus for multiply-accumulate two-dimensional separable symmetric filtering
US8086755B2 (en) | 2004-11-29 | 2011-12-27 | Egenera, Inc. | Distributed multicast system and method in a network
US9250665B2 (en) | 2012-06-07 | 2016-02-02 | Apple Inc. | GPU with dynamic performance adjustment
US9741098B2 (en) | 2012-10-12 | 2017-08-22 | Nvidia Corporation | System and method for optimizing image quality in a digital camera
KR102124395B1 (en) | 2013-08-12 | 2020-06-18 | 삼성전자주식회사 | Graphics processing apparatus and method thereof
CN104036451B (en) | 2014-06-20 | 2018-12-11 | 深圳市腾讯计算机系统有限公司 | Model method for parallel processing and device based on multi-graphics processor
CN104035751B (en) | 2014-06-20 | 2016-10-12 | 深圳市腾讯计算机系统有限公司 | Data parallel processing method based on multi-graphics processor and device
US20160342887A1 (en) | 2015-05-21 | 2016-11-24 | minds.ai inc. | Scalable neural network system
US10002402B2 (en) | 2015-07-23 | 2018-06-19 | Sony Corporation | Learning convolution neural networks on heterogeneous CPU-GPU platform
CN105227669A (en) | 2015-10-15 | 2016-01-06 | 浪潮(北京)电子信息产业有限公司 | A kind of aggregated structure system of CPU and the GPU mixing towards degree of depth study
US10481661B2 (en) | 2016-03-30 | 2019-11-19 | Intel Corporation | Power supply interface light load signal

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11521067B2 (en) | 2018-11-30 | 2022-12-06 | International Business Machines Corporation | Decentralized distributed deep learning
WO2020109891A1 (en) * | 2018-11-30 | 2020-06-04 | International Business Machines Corporation | Decentralized distributed deep learning
US12229683B2 (en) * | 2018-11-30 | 2025-02-18 | Ansys, Inc. | Systems and methods for building dynamic reduced order physical models
GB2593070A (en) * | 2018-11-30 | 2021-09-15 | Ibm | Decentralized distributed deep learning
US20240193423A1 (en) * | 2018-11-30 | 2024-06-13 | Ansys, Inc. | Systems and methods for building dynamic reduced order physical models
US11922314B1 (en) * | 2018-11-30 | 2024-03-05 | Ansys, Inc. | Systems and methods for building dynamic reduced order physical models
US10732983B1 (en) * | 2019-05-02 | 2020-08-04 | Capital One Services, Llc | Systems and methods of parallel and distributed processing of datasets for model approximation
US11385901B2 (en) | 2019-05-02 | 2022-07-12 | Capital One Services, Llc | Systems and methods of parallel and distributed processing of datasets for model approximation
US20220262953A1 (en) * | 2019-08-08 | 2022-08-18 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor device
US11908947B2 (en) * | 2019-08-08 | 2024-02-20 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor device
CN110889439A (en) * | 2019-11-08 | 2020-03-17 | 浪潮电子信息产业股份有限公司 | An image feature extraction method, device, electronic device and storage medium
US11687778B2 (en) | 2020-01-06 | 2023-06-27 | The Research Foundation For The State University Of New York | Fakecatcher: detection of synthetic portrait videos using biological signals
US12106216B2 (en) | 2020-01-06 | 2024-10-01 | The Research Foundation For The State University Of New York | Fakecatcher: detection of synthetic portrait videos using biological signals
CN111966175A (en) * | 2020-08-14 | 2020-11-20 | 苏州浪潮智能科技有限公司 | A kind of GPU fixing structure and installation method thereof
US11741967B2 (en) * | 2021-01-04 | 2023-08-29 | Kwai Inc. | Systems and methods for automatic speech recognition based on graphics processing units
US20220215843A1 (en) * | 2021-01-04 | 2022-07-07 | Kwai Inc. | Systems and methods for automatic speech recognition based on graphics processing units
CN114363985A (en) * | 2022-01-10 | 2022-04-15 | 黑龙江大学 | A method for constructing a binary tree based on node weights and a method for updating the binary tree
US20250217920A1 (en) * | 2023-12-28 | 2025-07-03 | Dell Products L.P. | Method, device, and product for gpu cluster
CN117725348A (en) * | 2024-02-07 | 2024-03-19 | 蓝象智联(杭州)科技有限公司 | Thread management method and system in GPU computing large-scale array summation process

Also Published As

Publication number | Publication date
US10210594B2 (en) | 2019-02-19
US20180253816A1 (en) | 2018-09-06
US20180253817A1 (en) | 2018-09-06
US10169084B2 (en) | 2019-01-01

Similar Documents

Publication | Title
US10169084B2 (en) | Deep learning via dynamic root solvers
Dally et al. | Evolution of the graphics processing unit (GPU)
US12430702B2 (en) | Learning robotic tasks using one or more neural networks
US11995767B2 (en) | Apparatus and method for compressing ray tracing acceleration structure build data
US20230410375A1 (en) | Temporally stable data reconstruction with an external recurrent neural network
US12354010B2 (en) | Gradient compression for distributed training
JP7428315B2 (en) | System, device, method and program for efficient distributed denoising of graphics frames
US11615602B2 (en) | Appearance-driven automatic three-dimensional modeling
CN114365185A (en) | Generate images using one or more neural networks
KR20220062575A (en) | Video upsampling using one or more neural networks
CN113449859A (en) | Data processing method and device
JP2021149932A (en) | Apparatus and method for asynchronous ray tracing
US11282258B1 (en) | Adaptive sampling at a target sampling rate
US20220398283A1 (en) | Method for fast and better tree search for reinforcement learning
US11610370B2 (en) | Joint shape and appearance optimization through topology sampling
US20230298127A1 (en) | Apparatus and method for biased bvh traversal path
CN115701613A (en) | Multiresolution hash coding for neural networks
CN113762461A (en) | Training Neural Networks with Limited Data Using Reversible Boosting Operators
EP4246448A1 (en) | Apparatus and method for acceleration data structure re-braiding with camera position
JP2021149934A (en) | Apparatus and method for performing stable short delay sorting operation
US20240404174A1 (en) | Neural head avatar construction from an image
CN115202922A (en) | Packed Error Correcting Code (ECC) for Compressed Data Protection
US20230298243A1 (en) | 3d digital avatar generation from a single or few portrait images
US20240233238A1 (en) | Apparatus and method for intra-bvh level-of-detail selection
US11568621B2 (en) | Dynamic character model fitting of three-dimensional digital items

Legal Events

Code | Title | Description
AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: JOHN, ANTO AJAY RAJ; REEL/FRAME: 045454/0580; Effective date: 20170302
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE

