US20230419128A1 - Methods for development of a machine learning system through layered gradient boosting - Google Patents

Methods for development of a machine learning system through layered gradient boosting

Info

Publication number
US20230419128A1
US20230419128A1 (application US18/301,660; granted as US11853906B1)
Authority
US
United States
Prior art keywords
model
decision trees
parameters
loss function
iterations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US18/301,660
Other versions
US11853906B1 (en)
Inventor
Rachael McNaughton
Richard Bland
Colin Towers
William James
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Towers Watson Software Ltd
Original Assignee
Towers Watson Software Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Towers Watson Software Ltd
Priority to US18/301,660 (granted as US11853906B1)
Priority to EP23739650.2A (EP4405870A1)
Priority to PCT/IB2023/000229 (WO2024003609A1)
Assigned to Towers Watson Software Limited (assignment of assignors' interest; assignors: MCNAUGHTON, RACHAEL; BLAND, RICHARD; TOWERS, COLIN)
Priority to US18/530,732 (US20240119315A1)
Application granted
Publication of US11853906B1
Publication of US20230419128A1
Legal status: Active
Anticipated expiration

Abstract

A layered machine learning system for processing data. The machine learning system comprises decision trees with different depths. An iterative training process is performed on the layered machine learning system to determine the structures of the decision trees based on prior predictions. The fitted decision trees are further configured to update leaf values with a gradient boosting method. By accumulating the predictions of decision trees from prior iterations, interaction effects are modeled among different depths within the layered machine learning system.
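The abstract describes per-depth trees whose residuals are taken against predictions accumulated from lower depths and prior iterations. As a rough, hypothetical sketch (not the patented procedure itself), a squared-error version of such a layered scheme might look like the following; `fit_tree`, `layered_boost`, and all parameter choices are illustrative assumptions:

```python
# Toy sketch of "layered" boosting: every iteration fits one small tree per
# depth, and each tree is fit to residuals of everything trained before it
# (lower depths in this iteration, all depths in prior iterations).
import numpy as np

def fit_tree(x, target, depth):
    """Tiny recursive regression tree on a single feature (illustrative only)."""
    values = np.unique(x)
    if depth == 0 or len(values) < 2:
        v = target.mean()
        return lambda z: np.full(len(z), v)
    best = None
    for thr in values[:-1]:               # candidate split points
        m = x <= thr
        l, r = target[m], target[~m]
        sse = ((l - l.mean()) ** 2).sum() + ((r - r.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr)
    thr = best[1]
    m = x <= thr
    left = fit_tree(x[m], target[m], depth - 1)
    right = fit_tree(x[~m], target[~m], depth - 1)
    return lambda z: np.where(z <= thr, left(z), right(z))

def layered_boost(x, y, depths=(1, 2), n_iter=30, lr=0.3):
    """Each depth's tree sees the cumulated predictions of equal-or-lower
    depths from earlier rounds, so deeper layers model interaction effects
    left over by shallower ones (squared-error loss assumed)."""
    pred = np.full(len(y), y.mean())
    trees = []
    for _ in range(n_iter):
        for d in depths:                  # shallow layers update first
            residual = y - pred           # negative gradient of squared error
            tree = fit_tree(x, residual, d)
            pred = pred + lr * tree(x)
            trees.append((d, tree))
    return trees, pred
```

On a toy signal such as y = sin(6x), the in-sample error drops well below the variance of y within a few dozen rounds, with depth-2 layers picking up structure the stumps cannot.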


Claims (23)

1. A system, comprising:
at least one processor;
at least one database configured to store a training dataset comprising a plurality of records, wherein each of the plurality of records includes one or more attribute variables associated with historical customer data and a response variable representing a known insurance premium outcome;
at least one memory storing instructions that, when executed by the at least one processor, cause the at least one processor to:
retrieve the training dataset from the at least one database;
convert the plurality of records in the training dataset to categorical variables in numeric representation;
select a first set of numerical parameters and a second set of numerical parameters;
select a loss function based on a probability distribution;
initialize a model having a plurality of decision trees with different depths, based on the training dataset, the first set of numerical parameters, the second set of numerical parameters, the selected loss function, and a third set of numerical parameters, to compute a plurality of model parameters;
train the model, based on the training dataset, to refine a plurality of model parameters of the plurality of decision trees through a plurality of iterations, wherein in each iteration, the instructions cause the at least one processor to:
compute a first-order derivative of the selected loss function and a second-order derivative of the selected loss function based on the training dataset, the first set of numerical parameters, and a first set of model parameters of decision trees in equal or lower depths in prior iterations;
determine splits of the plurality of decision trees based on comparison results between the second-order derivative of the selected loss function and a first element of the second set of numerical parameters;
compute a marginal parameter based on the ratio of the computed first-order derivative of the selected loss function and the computed second-order derivative of the selected loss function;
update the model parameters of the plurality of decision trees with a product of the marginal parameter and a second element of the second set of numerical parameters based on a second set of model parameters of decision trees in lower depths of all iterations and in the equal depth in prior iterations;
determine that the trained model, after training through the plurality of iterations, satisfies at least one of a plurality of stopping criteria; and
store the splits and the plurality of model parameters of the plurality of decision trees within the trained model.
2. The system of claim 1, wherein the instructions further cause the at least one processor to:
compute a gain value of one of the decision trees based on a difference in an evaluation metric between a parent node and a sum of two child nodes of the parent node, wherein the evaluation metric is determined by the first-order derivative of the selected loss function and the second-order derivative of the selected loss function;
determine that the computed gain value does not satisfy a third element of the second set of numerical parameters; and
remove each leaf node of the one of the decision trees.
3. The system of claim 2, wherein the third element of the second set of numerical parameters is a minimum split loss.
4. The system of claim 1, wherein the first set of numerical parameters is a plurality of weight variables for a plurality of records in the training dataset.
5. The system of claim 1, wherein the second set of numerical parameters is a plurality of hyperparameters including minimum child weight, learning rate, minimum split loss, number of iterations, maximum depth of the decision tree, row sampling, column sampling by tree, and column sampling by split.
6. The system of claim 1, wherein the first element of the second set of numerical parameters is a minimum child weight.
7. The system of claim 1, wherein the second element of the second set of numerical parameters is a learning rate.
8. The system of claim 1, wherein the probability distribution is one of a Gaussian (normal) distribution, a Poisson distribution, a gamma distribution, a Tweedie distribution, and a logistic distribution.
9. The system of claim 1, wherein the third set of numerical parameters is a plurality of starting values including a cutoff value for a selected attribute and a predicted value in the first iteration.
10. The system of claim 1, wherein the stopping criteria comprise a maximum number of iterations specified in the second set of numerical parameters, a threshold value indicating no additional gain to be found in a new training iteration, and a threshold value of performance evaluation of the model based on a validation set.
11. The system of claim 1, wherein the model is configured to generate predictions of at least one of insurance premium policies, claim cost, claim frequency, and claim severity, based on customer input data.
12. A method, comprising:
retrieving a training dataset from at least one database;
converting a plurality of records in the training dataset to categorical variables in numeric representation, wherein each of the plurality of records includes one or more attribute variables associated with historical customer data and a response variable representing a known insurance premium outcome;
selecting a first set of numerical parameters and a second set of numerical parameters;
selecting a loss function based on a probability distribution;
initializing a model having a plurality of decision trees with different depths, based on the training dataset, the first set of numerical parameters, the second set of numerical parameters, the selected loss function, and a third set of numerical parameters, to compute a plurality of model parameters;
training the model, based on the training dataset, to refine a plurality of model parameters of the plurality of decision trees through a plurality of iterations, wherein each iteration comprises:
computing a first-order derivative of the selected loss function and a second-order derivative of the selected loss function based on the training dataset, the first set of numerical parameters, and a first set of model parameters of decision trees in equal or lower depths in prior iterations;
determining splits of the plurality of decision trees based on comparison results between the second-order derivative of the selected loss function and a first element of the second set of numerical parameters;
computing a marginal parameter based on the ratio of the computed first-order derivative of the selected loss function and the computed second-order derivative of the selected loss function;
updating the model parameters of the plurality of decision trees with a product of the marginal parameter and a second element of the second set of numerical parameters based on a second set of model parameters of decision trees in lower depths of all iterations and in the equal depth in prior iterations;
determining that the trained model, after training through the plurality of iterations, satisfies at least one of a plurality of stopping criteria; and
storing the splits and the plurality of model parameters of the plurality of decision trees within the trained model.
13. The method of claim 12, further comprising:
computing a gain value of one of the decision trees based on a difference in an evaluation metric between a parent node and a sum of two child nodes of the parent node, wherein the evaluation metric is determined by the first-order derivative of the selected loss function and the second-order derivative of the selected loss function;
determining that the computed gain value does not satisfy a third element of the second set of numerical parameters; and
removing each leaf node of the one of the decision trees.
14. The method of claim 13, wherein the third element of the second set of numerical parameters is a minimum split loss.
15. The method of claim 12, wherein the first set of numerical parameters is a plurality of weight variables for a plurality of records in the training dataset.
16. The method of claim 12, wherein the second set of numerical parameters is a plurality of hyperparameters including minimum child weight, learning rate, minimum split loss, number of iterations, maximum depth of the decision tree, row sampling, column sampling by tree, and column sampling by split.
17. The method of claim 12, wherein the first element of the second set of numerical parameters is a minimum child weight.
18. The method of claim 12, wherein the second element of the second set of numerical parameters is a learning rate.
19. The method of claim 12, wherein the probability distribution is one of a Gaussian (normal) distribution, a Poisson distribution, a gamma distribution, a Tweedie distribution, and a logistic distribution.
20. The method of claim 12, wherein the third set of numerical parameters is a plurality of starting values including a cutoff value for a selected attribute and a predicted value in the first iteration.
21. The method of claim 12, wherein the stopping criteria comprise a maximum number of iterations specified in the second set of numerical parameters, a threshold value indicating no additional gain to be found in a new training iteration, and a threshold value of performance evaluation of the model based on a validation set.
22. The method of claim 12, wherein the model is configured to generate predictions of at least one of insurance premium policies, claim cost, claim frequency, and claim severity, based on customer input data.
23. A non-transitory computer-readable medium including processor-executable instructions for generating a layered machine learning model to process data to predict at least one of insurance premium policies, claim cost, claim frequency, and claim severity, wherein the instructions, when executed by a processor, cause the processor to perform the steps of:
retrieving a training dataset from at least one database;
converting a plurality of records in the training dataset to categorical variables in numeric representation, wherein each of the plurality of records includes one or more attribute variables associated with historical customer data and a response variable representing a known insurance premium outcome;
selecting a plurality of weight variables for the plurality of records in the training dataset and a plurality of hyperparameters including minimum child weight, learning rate, minimum split loss, number of iterations, maximum depth of the decision tree, row sampling, column sampling by tree, and column sampling by split;
selecting a loss function based on one of Gaussian (normal) distribution, Poisson distribution, gamma distribution, Tweedie distribution, and logistic distribution;
initializing a model having a plurality of decision trees with different depths, based on the training dataset, the weight variables, the plurality of hyperparameters, the selected loss function, and a plurality of starting values including a cutoff value for a selected attribute and a predicted value in the first iteration, to compute a plurality of model parameters;
training the model, based on the training dataset, to refine a plurality of model parameters of the plurality of decision trees through a plurality of iterations, wherein each iteration comprises:
computing a first-order derivative of the selected loss function and a second-order derivative of the selected loss function based on the training dataset, the weight variables, and a first set of model parameters of decision trees in equal or lower depths in prior iterations;
determining splits of the plurality of decision trees based on comparison results between the second-order derivative of the selected loss function and a minimum child weight;
computing a marginal parameter based on the ratio of the computed first-order derivative of the selected loss function and the computed second-order derivative of the selected loss function;
updating the model parameters of the plurality of decision trees with a product of the marginal parameter and a learning rate based on a second set of model parameters of decision trees in lower depths of all iterations and in the equal depth in prior iterations;
determining that the trained model, after training through the plurality of iterations, satisfies at least one of a plurality of stopping criteria, including a maximum number of iterations specified in the plurality of hyperparameters, a threshold value indicating no additional gain to be found in a new training iteration, and a threshold value of performance evaluation of the model based on a validation set; and
storing the splits and the plurality of model parameters of the plurality of decision trees within the trained model.
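Claims 1-3 describe a second-order (Newton-style) update familiar from gradient-boosting libraries such as XGBoost: the marginal parameter is the ratio of the summed first- and second-order loss derivatives, scaled by the learning rate, with splits gated by a minimum child weight and pruned by a gain threshold. Below is a hedged sketch of just those two computations; the helper names (`leaf_update`, `split_gain`) are hypothetical and squared-error derivatives are assumed:

```python
# Sketch of the second-order leaf update and split gain described in the
# claims; mirrors the XGBoost-style recipe (g = first derivative of the loss,
# h = second derivative). Helper names are illustrative assumptions.
import numpy as np

def leaf_update(g, h, learning_rate, min_child_weight):
    """Leaf value = learning rate x marginal parameter, where the marginal
    parameter is -(sum of g) / (sum of h). Returns None when the leaf's
    Hessian mass fails the minimum-child-weight comparison (claim 1)."""
    G, H = g.sum(), h.sum()
    if H < min_child_weight:
        return None
    return -learning_rate * G / H

def split_gain(g, h, left_mask):
    """Gain = parent metric minus the sum of the two children's metrics
    (claim 2), with metric(S) = -0.5 * G_S**2 / H_S. A split whose gain
    falls below the minimum split loss would be pruned (claim 3)."""
    def metric(gs, hs):
        return -0.5 * gs.sum() ** 2 / hs.sum()
    parent = metric(g, h)
    children = metric(g[left_mask], h[left_mask]) + metric(g[~left_mask], h[~left_mask])
    return parent - children
```

For squared-error loss, g is simply prediction minus target and h is 1 per record, so a leaf whose residuals cancel (G = 0) receives a zero update while a split that separates the positive from the negative residuals has a large gain.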

Priority Applications (4)

Application Number | Priority Date | Filing Date | Title
US18/301,660 (US11853906B1) | 2022-06-27 | 2023-04-17 | Methods for development of a machine learning system through layered gradient boosting
EP23739650.2A (EP4405870A1) | 2022-06-27 | 2023-04-18 | Methods for development of a machine learning system
PCT/IB2023/000229 (WO2024003609A1) | 2022-06-27 | 2023-04-18 | Methods for development of a machine learning system
US18/530,732 (US20240119315A1) | 2022-06-27 | 2023-12-06 | Methods for development of a machine learning system through layered gradient boosting

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
US202263356006P | 2022-06-27 | 2022-06-27 |
US18/301,660 (US11853906B1) | 2022-06-27 | 2023-04-17 | Methods for development of a machine learning system through layered gradient boosting

Related Child Applications (1)

Application Number | Relation | Title | Priority Date | Filing Date
US18/530,732 | Continuation (US20240119315A1) | Methods for development of a machine learning system through layered gradient boosting | 2022-06-27 | 2023-12-06

Publications (2)

Publication Number | Publication Date
US11853906B1 | 2023-12-26
US20230419128A1 | 2023-12-28

Family

ID=89323105

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US18/301,660 (Active; US11853906B1) | Methods for development of a machine learning system through layered gradient boosting | 2022-06-27 | 2023-04-17

Country Status (1)

Country | Link
US | US11853906B1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN118586046B* | 2024-08-05 | 2024-12-10 | 和璞思德能源顾问(山东)有限公司 | Green energy transaction data monitoring system based on blockchain
CN119385408A* | 2024-11-01 | 2025-02-07 | 南通市久正人体工学股份有限公司 | A brushless push rod self-learning control method and system
CN119477382A* | 2024-11-08 | 2025-02-18 | 平安银行股份有限公司 | Lead following method, lead following device, electronic device and storage medium
CN119167211B* | 2024-11-20 | 2025-05-27 | 济宁职业技术学院 | Mandarin level tester management method and system
CN119474888B* | 2025-01-15 | 2025-06-24 | 齐鲁工业大学(山东省科学院) | Turbulence-influencing temperature and salt depth data identification method of portable underwater glider
CN119537899B* | 2025-01-21 | 2025-04-18 | 北京南天智联信息科技股份有限公司 | Data identification method and system based on supervised learning
CN119740990B* | 2025-02-28 | 2025-07-18 | 广东美电贝尔科技集团股份有限公司 | A comprehensive evaluation method of duty effectiveness based on artificial intelligence

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20220083906A1* | 2020-09-16 | 2022-03-17 | International Business Machines Corporation | Federated learning technique for applied machine learning
US20220164877A1* | 2020-11-24 | 2022-05-26 | Zestfinance, Inc. | Systems and methods for generating gradient-boosted models with improved fairness

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11514526B1 | 2016-05-17 | 2022-11-29 | Liberty Mutual Insurance Company | Systems and methods for property damage restoration predictions based upon processed digital images
US10977737B2 | 2018-01-10 | 2021-04-13 | Liberty Mutual Insurance Company | Training gradient boosted decision trees with progressive maximum depth for parsimony and interpretability
CN108345702A | 2018-04-10 | 2018-07-31 | 北京百度网讯科技有限公司 | Entity recommendation method and apparatus
CN108932480B | 2018-06-08 | 2022-03-15 | 电子科技大学 | Distributed optical fiber sensing signal feature learning and classifying method based on 1D-CNN
US10510002B1 | 2019-02-14 | 2019-12-17 | Capital One Services, LLC | Stochastic gradient boosting for deep neural networks


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wen et al., Efficient Gradient Boosted Decision Tree Training on GPUs, 2018 IEEE International Parallel and Distributed Processing Symposium, pp. 234-243 (Year: 2018)*

Also Published As

Publication number | Publication date
US11853906B1 | 2023-12-26

Similar Documents

Publication | Title
US11853906B1 | Methods for development of a machine learning system through layered gradient boosting
US20240112209A1 | Systems and methods for providing machine learning model disparate impact information
US20250200667A1 | Technology for building and managing data models
CN110704730B | Product data pushing method and system based on big data and computer equipment
Huber et al. | Should I stay or should I go? A latent threshold approach to large-scale mixture innovation models
US20230105547A1 | Machine learning model fairness and explainability
CN106095942B | Strong variable extracting method and device
US8032473B2 | Generalized reduced error logistic regression method
CN110995459A | Abnormal object identification method, device, medium and electronic equipment
US9280740B1 | Transforming predictive models
Mustika et al. | Analysis accuracy of XGBoost model for multiclass classification: a case study of applicant-level risk prediction for life insurance
US20090177612A1 | Method and apparatus for analyzing data to provide decision making information
US8984022B1 | Automating growth and evaluation of segmentation trees
KR102519878B1 | Apparatus, method and recording medium storing commands for providing artificial-intelligence-based risk management solution in credit exposure business of financial institution
Priya | Linear regression algorithm in machine learning through MATLAB
Stødle et al. | Data-driven predictive modeling in risk assessment: challenges and directions for proper uncertainty representation
KR102409041B1 | Portfolio asset allocation reinforcement learning method using actor-critic model
Serengil et al. | A comparative study of machine learning approaches for non-performing loan prediction with explainability
Strickland | Predictive modeling and analytics
US20150088789A1 | Hierarchical latent variable model estimation device, hierarchical latent variable model estimation method, supply amount prediction device, supply amount prediction method, and recording medium
Sembina | Building a scoring model using the AdaBoost ensemble model
CN119398929A | Multi-factor analysis system, method and medium applicable to stock market
Zhou | Explainable AI in request-for-quote
Pertiwi et al. | Combination of stacking with genetic algorithm feature selection to improve default prediction in P2P lending
US20240119315A1 | Methods for development of a machine learning system through layered gradient boosting

Legal Events

Code | Title | Description
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
AS | Assignment | Owner name: TOWERS WATSON SOFTWARE LIMITED, UNITED KINGDOM; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MCNAUGHTON, RACHAEL; BLAND, RICHARD; TOWERS, COLIN; SIGNING DATES FROM 20230904 TO 20230905; REEL/FRAME: 065497/0597
STCF | Information on status: patent grant | Free format text: PATENTED CASE

