CN114219611A - Loan amount calculation method and device, computer equipment and storage medium

Info

Publication number
CN114219611A
Authority
CN
China
Prior art keywords
risk
data
coefficient
module
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111375611.0A
Other languages
Chinese (zh)
Inventor
刘垚
范戈
曾桂平
许晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202111375611.0A
Publication of CN114219611A

Abstract

The disclosure relates to a loan amount calculation method, apparatus, computer device, and storage medium. The method comprises: calculating the difference between income amount data and liability amount data to obtain basic amount data of an object; calculating a risk coefficient by linear programming, using a pre-trained default probability model, a preset fractional span, a basic total amount calculated from the basic amount data, bad amount data determined from objects with overdue loans, and a preset risk constraint condition; determining the level of the object in each preset dimension, and determining a layering coefficient of the object from those levels; and calculating the loan amount of the object from the basic amount data, the risk coefficient, and the layering coefficient. With this method the personal loan limit can be determined accurately, which reduces personal loan default risk and increases the overall income from personal loans.

Description

Loan amount calculation method and device, computer equipment and storage medium
Technical Field
The disclosure relates to the technical field of big data, in particular to a loan amount calculation method, a loan amount calculation device, computer equipment and a storage medium.
Background
With the development of internet financial technology, people's consumption concepts and consumption levels keep changing and rising, and personal credit loans account for a growing share of lending. Personal credit loans are characterized by a large number of transactions, small amounts, and rich data, which has led to products that approve personal loans automatically and in real time. Most of these products, however, rely mainly on higher interest rates to offset the losses caused by borrowers who fail to repay, and do not manage loan risk effectively. They use the traditional scorecard method and evaluate a client's personal loan risk from basic data only, so the personal loan limit is determined only roughly. Because client risk cannot be assessed accurately, the limit cannot be determined accurately, which greatly increases personal loan default risk and reduces the overall income from personal loans.
Disclosure of Invention
In view of the above, there is a need to provide a loan amount calculation method, device, computer device and storage medium, which can accurately determine the amount of a personal loan, thereby reducing the risk of personal loan default and increasing the overall income of the personal loan.
A loan amount calculation method comprises the following steps:
calculating the difference value between the income limit data and the liability limit data to obtain the basic limit data of the object, wherein the income limit data is determined according to the financial information of the object, and the liability limit data is determined according to the credit information of the object;
calculating a risk coefficient by linear programming, using a pre-trained default probability model, a preset fractional span, a basic total amount calculated from the basic amount data, bad amount data determined from objects with overdue loans, and a preset risk constraint condition;
determining the hierarchy of an object of each dimension according to a preset dimension, and determining the layering coefficient of the object based on the hierarchy of the object of each dimension;
and calculating the loan amount of the object by using the basic amount data, the risk coefficient and the layering coefficient.
In one embodiment, the method for training the default probability model includes:
analyzing and processing target variables determined through account age analysis to obtain characteristic variables, wherein the target variables comprise objects with loan overdue exceeding the first time and objects with loan not overdue;
performing model fitting on the characteristic variables by using a logistic regression algorithm, and performing model evaluation on a logistic regression model obtained by model fitting;
and, when the model evaluation shows an evaluation index not lower than a first preset value and a stability index not higher than a second preset value, taking the evaluated logistic regression model as the default probability model.
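The acceptance rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not name the evaluation or stability index, so the sketch assumes a population stability index (PSI) as the stability index and treats the evaluation index as a number supplied by the caller. The function names and both threshold defaults are hypothetical.

```python
import math


def population_stability_index(expected, actual):
    """PSI between two binned score distributions given as lists of proportions.

    Assumption: PSI is used as the stability index; the source only says
    "stability index".
    """
    eps = 1e-6  # guard against log(0) when a bin is empty
    psi = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


def accept_model(evaluation_index, stability_index,
                 min_evaluation=0.3, max_stability=0.1):
    """Accept the fitted model only if both preset conditions hold.

    min_evaluation and max_stability stand in for the "first preset value"
    and "second preset value"; the defaults are hypothetical.
    """
    return evaluation_index >= min_evaluation and stability_index <= max_stability
```

For example, `accept_model(0.35, population_stability_index(train_bins, test_bins))` would return `True` only when the score distribution has stayed stable between the two samples.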
In one embodiment, analyzing and processing the target variable determined by account age analysis to obtain a characteristic variable includes:
acquiring information data of an object for establishing a model, determining a target variable of the information data through account age analysis, and acquiring modeling data in the target variable, wherein the modeling data comprises owned data and third-party data acquired after the object is authorized;
performing descriptive statistics on modeling data;
performing data processing on the modeling data after the descriptive statistics to obtain the characteristic variables, wherein the data processing comprises: deleting duplicate values, handling outliers, handling missing values, standardizing data, deriving features, binning variables, applying weight-of-evidence conversion, and screening features according to the information values and correlation coefficients of the variables obtained through feature derivation.
In one embodiment, the feature screening according to the information values and the correlation coefficients of the variables derived by the features includes:
calculating an information value of the modeling data;
deleting the modeling data corresponding to the information value smaller than the first information threshold value or the information value larger than the second information threshold value;
calculating correlation coefficients of variables derived through the features and modeling data;
and, among the modeling data whose correlation coefficient is larger than the correlation coefficient threshold, retaining the modeling data with the largest information value.
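The screening steps above can be illustrated with a short sketch. It assumes the conventional information value formula over binned good/bad counts and a Pearson correlation coefficient; the source does not say which correlation measure is used. The thresholds, the half-count smoothing for empty bins, and all function names are hypothetical.

```python
import math


def information_value(good_counts, bad_counts):
    """Information value of one binned variable (conventional formula)."""
    total_good, total_bad = sum(good_counts), sum(bad_counts)
    iv = 0.0
    for g, b in zip(good_counts, bad_counts):
        # Hypothetical smoothing: treat an empty bin as half an observation.
        pg = max(g, 0.5) / total_good
        pb = max(b, 0.5) / total_bad
        iv += (pg - pb) * math.log(pg / pb)
    return iv


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def screen_features(iv_by_name, columns, iv_low=0.02, iv_high=0.5, corr_max=0.7):
    """Drop variables whose IV is outside [iv_low, iv_high]; among variables
    correlated above corr_max, keep only the one with the largest IV."""
    kept = [name for name, iv in iv_by_name.items() if iv_low <= iv <= iv_high]
    kept.sort(key=lambda name: iv_by_name[name], reverse=True)
    selected = []
    for name in kept:
        if all(abs(pearson(columns[name], columns[s])) <= corr_max for s in selected):
            selected.append(name)
    return selected
```

Visiting candidates in descending order of information value is one simple way to guarantee that, within a group of highly correlated variables, the one retained has the largest information value.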
In one embodiment, calculating the risk coefficient by linear programming, using the pre-trained default probability model, the preset fractional span, the basic total amount calculated from the basic amount data, the bad amount data determined from objects with overdue loans, and the preset risk constraint condition, comprises:
calculating the default probability of the object by using the pre-trained default probability model;
determining a score of the object based on the default probability and a conversion coefficient calculated from the fractional span;
determining a risk level of the object according to the score;
and calculating the basic total amount and bad amount data in each risk level, and performing a linear programming calculation under the preset risk constraint condition to obtain the risk coefficient corresponding to the level.
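The probability-to-score conversion described above can be sketched under one common assumption: the "fractional span" is interpreted as the number of points by which the score rises when the odds of default halve, as in conventional scorecard scaling. The source does not give the conversion formula, so the base score, base odds, level cutoffs, and function names below are all hypothetical.

```python
import math


def conversion_coefficients(score_span, base_score=600.0, base_odds=1.0 / 20.0):
    """Return (factor, offset) so that the score equals base_score at
    base_odds and rises by score_span each time the default odds halve."""
    factor = score_span / math.log(2)
    offset = base_score + factor * math.log(base_odds)
    return factor, offset


def probability_to_score(p_default, factor, offset):
    """Convert a default probability into a score (higher means lower risk)."""
    odds = p_default / (1.0 - p_default)  # odds of default
    return offset - factor * math.log(odds)


def risk_level(score, cutoffs):
    """Map a score to a risk level; level 1 is the lowest-risk level.

    cutoffs are hypothetical lower score bounds in descending order.
    """
    for level, cutoff in enumerate(cutoffs, start=1):
        if score >= cutoff:
            return level
    return len(cutoffs) + 1
```

With a span of 20 points, an object with default odds of 1:20 scores 600 and one with odds of 1:40 scores 620, so the cutoffs `[650, 600, 550]` would place both in level 2.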
In one embodiment, calculating the basic total amount and bad amount data in each risk level and performing the linear programming calculation under the preset risk constraint condition to obtain the risk coefficient corresponding to the level comprises:
calculating the sum of the basic amount data of the objects in each risk level to obtain the basic total amount data of each level;
calculating the sum of the basic amount data of the objects with overdue loans in each risk level to obtain the bad amount data of each level;
calculating a risk adjustment coefficient for each risk level from the basic total amount data and the bad amount data of each risk level and the preset risk constraint condition;
calculating a total amount bad rate from the basic total amount data, the bad amount data, and the risk adjustment coefficient of each risk level;
and taking the risk adjustment coefficients for which the total amount bad rate is smallest and the risk constraint condition is satisfied as the risk coefficients of the corresponding risk levels.
In one embodiment, the preset risk constraint conditions include:
the risk coefficients, ordered by risk level, decrease progressively;
the difference between adjacent risk coefficients is greater than or equal to a first preset difference;
the risk coefficients ranked first and second are greater than a first threshold, and the risk coefficients ranked below second are less than the first threshold;
and a quota ratio is calculated for each risk level from the risk adjustment coefficient and the basic total amount data of that level, and the sum of the quota ratios of all risk levels is a first preset percentage.
In one embodiment, the quota ratio of each risk level is calculated from the risk adjustment coefficient and the basic total amount data of each risk level using the following formula:
Figure BDA0003363879650000041
and the total amount bad rate is calculated from the basic total amount data, the bad amount data, and the risk adjustment coefficient of each risk level using the following formula:
Figure BDA0003363879650000042
where b_i is the bad amount data of the i-th risk level, X_i is the risk coefficient of the i-th risk level, and a_i is the basic total amount data of the i-th risk level.
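The two formulas are available only as images in this text, so the sketch below works from assumed forms that are consistent with the symbol definitions: a quota ratio of a_i * X_i / sum(a) for each level, and a total amount bad rate of sum(b_i * X_i) / sum(a_i * X_i). Under these assumptions, fixing the sum of quota ratios fixes the denominator of the bad rate, so the objective becomes linear in X_i. To stay dependency-free, the sketch uses an exhaustive search over a coarse grid as a stand-in for a linear programming solver; it is not the patent's procedure. All parameter values and function names are hypothetical.

```python
from itertools import product


def quota_ratio_sum(a, x):
    """Sum of quota ratios, assuming ratio_i = a_i * x_i / sum(a)."""
    return sum(ai * xi for ai, xi in zip(a, x)) / sum(a)


def total_bad_rate(a, b, x):
    """Total amount bad rate, assuming sum(b_i * x_i) / sum(a_i * x_i)."""
    return (sum(bi * xi for bi, xi in zip(b, x))
            / sum(ai * xi for ai, xi in zip(a, x)))


def satisfies_constraints(a, x, min_gap, threshold, target, tol=1e-9):
    """Check the four preset risk constraint conditions."""
    # Coefficients decrease from the best level to the worst by at least min_gap.
    if any(x[i] - x[i + 1] < min_gap - tol for i in range(len(x) - 1)):
        return False
    # The first two coefficients exceed the threshold; the rest are below it.
    if any(xi <= threshold for xi in x[:2]) or any(xi >= threshold for xi in x[2:]):
        return False
    # The quota ratios sum to the preset percentage.
    return abs(quota_ratio_sum(a, x) - target) <= tol


def search_risk_coefficients(a, b, min_gap=0.1, threshold=1.0, target=1.0,
                             step=0.1, upper=2.0):
    """Grid search for the feasible coefficients with the smallest bad rate."""
    grid = [round(k * step, 10) for k in range(int(round(upper / step)) + 1)]
    best, best_rate = None, float("inf")
    for x in product(grid, repeat=len(a)):
        if not satisfies_constraints(a, x, min_gap, threshold, target):
            continue
        rate = total_bad_rate(a, b, x)
        if rate < best_rate:
            best, best_rate = x, rate
    return best, best_rate
```

The grid search costs time exponential in the number of risk levels and can miss optima that lie between grid points; a real deployment would pass the same linear objective and constraints to an LP solver instead.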
In one embodiment, the preset dimensions include: an employer dimension, a salary level dimension, an employer region dimension, a generation month dimension, a credit history length dimension, and a historical repayment performance dimension.
In a second aspect, the present disclosure also provides a loan amount calculation apparatus, including:
the income amount determining module is used for determining income amount data according to the financial information of the object;
the liability credit limit determining module is used for determining liability credit limit data according to credit information of the object;
the basic limit calculation module is used for calculating the difference value of the income limit data and the liability limit data to obtain the basic limit data of the object;
the risk coefficient calculation module is used for calculating a risk coefficient through a pre-trained default probability model, a preset conversion coefficient, a basic total amount obtained through calculation according to basic amount data, bad amount data determined according to an overdue object of the loan, a preset risk constraint condition and linear programming;
the hierarchical coefficient determining module is used for determining the hierarchy of the object of each dimension according to the preset dimension and determining the hierarchical coefficient of the object based on the hierarchy of the object of each dimension;
and the loan amount calculation module is used for calculating the loan amount of the object by using the basic amount data, the risk coefficient and the layering coefficient.
In one embodiment of the apparatus, the apparatus further comprises: the device comprises an analysis processing module, a model evaluation module and a model determination module;
and the analysis processing module is used for analyzing and processing the target variables determined through the account age analysis to obtain the characteristic variables, wherein the target variables comprise objects with loan overdue exceeding the first time and objects with loan not overdue.
And the model evaluation module is used for performing model fitting on the characteristic variables by using a logistic regression algorithm and performing model evaluation on the logistic regression model obtained by model fitting.
And the model determining module is used for determining the logistic regression model for model evaluation as the default probability model under the condition that the evaluation index is not lower than a first preset value and the stability index is not higher than a second preset value in the model evaluation performed by the model evaluation module.
In one embodiment of the apparatus, the analysis processing module includes: a target variable determining module, a descriptive statistics module, and a data processing module.
The target variable determining module is used for acquiring information data of the object for establishing the model, determining the target variable of the information data through account age analysis, and acquiring modeling data in the target variable, wherein the modeling data comprises owned data and third-party data acquired after the object is authorized.
The descriptive statistics module is used for performing descriptive statistics on the modeling data.
The data processing module is used for performing data processing on the modeling data after the descriptive statistics to obtain the characteristic variables, wherein the data processing comprises: deleting duplicate values, handling outliers, handling missing values, standardizing data, deriving features, binning variables, applying weight-of-evidence conversion, and screening features according to the information values and correlation coefficients of the variables obtained through feature derivation.
In one embodiment of the apparatus, the data processing module comprises: the device comprises an information value calculating module, an information value deleting module, a correlation coefficient calculating module and an obtaining module;
and the information value calculating module is used for calculating the information value of the modeling data.
And the information value deleting module is used for deleting the modeling data corresponding to the information value smaller than the first information threshold value or the information value larger than the second information threshold value.
And the correlation coefficient calculation module is used for calculating the correlation coefficient of the variable derived from the characteristics and the modeling data.
And the acquisition module is used for acquiring the modeling data with the largest information value in the modeling data with the correlation coefficient larger than the correlation coefficient threshold value.
In one embodiment of the apparatus, the risk factor calculation module comprises: the system comprises a default probability calculation module, a score determination module, a risk grade matching module and a risk coefficient calculation module;
and the default probability calculation module is used for calculating the default probability of the object by utilizing a pre-trained default probability model.
And the score determining module is used for determining the score of the object based on the probability and the conversion coefficient obtained by the fractional span calculation.
And the risk grade matching module is used for determining the risk grade of the object according to the score.
And the risk coefficient calculation module is used for calculating the basic total amount and bad amount data in the risk level and performing linear programming calculation according to a preset risk constraint condition to obtain a risk coefficient corresponding to the level.
In one embodiment of the apparatus, the risk factor calculation module comprises: the system comprises a basic total amount data calculation module, a bad amount data calculation module, a risk adjustment coefficient calculation module, a total amount bad rate calculation module and a risk coefficient determination module;
and the basic total amount data calculation module is used for calculating the sum of the basic amount data of the object in each risk level to obtain the basic total amount data of each level.
And the bad credit line data calculation module is used for calculating the sum of basic credit line data of objects with overdue loan in each risk level to obtain the bad credit line data of each level.
And the risk adjustment coefficient calculation module is used for calculating the risk adjustment coefficient of each risk grade according to the basic total amount data and the bad amount data of each risk grade and the preset risk constraint condition.
And the total amount reject ratio calculation module is used for calculating the total amount reject ratio through the basic total amount data, the bad amount data and the risk adjustment coefficient of each risk grade.
And the risk coefficient determining module is used for determining the corresponding risk adjustment coefficient as the risk coefficient corresponding to the risk grade under the condition that the total limit reject ratio is minimum and the risk constraint condition is met.
In one embodiment of the apparatus, the risk coefficient calculation module further includes a risk constraint condition setting module for setting the risk constraint conditions, wherein the risk constraint conditions comprise: the risk coefficients, ordered by risk level, decrease progressively; the difference between adjacent risk coefficients is greater than or equal to a first preset difference; the risk coefficients ranked first and second are greater than a first threshold, and the risk coefficients ranked below second are less than the first threshold; and a quota ratio is calculated for each risk level from the risk adjustment coefficient and the basic total amount data of that level, and the sum of the quota ratios of all risk levels is a first preset percentage.
In one embodiment of the apparatus, the risk constraint setting module comprises: the quota ratio calculation module is used for calculating quota ratio of each risk level by adopting the following formula through the risk adjustment coefficient of each risk level and the basic total quota data:
Figure BDA0003363879650000061
where X_i is the risk coefficient of the i-th risk level and a_i is the basic total amount data of the i-th risk level.
In one embodiment of the apparatus, the total credit rejection rate calculating module is further configured to calculate the total credit rejection rate by using the following formula:
Figure BDA0003363879650000071
where b_i is the bad amount data of the i-th risk level, X_i is the risk coefficient of the i-th risk level, and a_i is the basic total amount data of the i-th risk level.
In one embodiment of the apparatus, the preset dimensions in the layering coefficient determining module include: an employer dimension, a salary level dimension, an employer region dimension, a generation month dimension, a credit history length dimension, and a historical repayment performance dimension.
In a third aspect, the present disclosure also provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.
In a fifth aspect, the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the above method.
The loan amount calculation method, apparatus, computer device, and storage medium combine financial data and credit data to determine the income amount and the liability amount, and from these the basic amount data. The risk coefficient of the object is calculated by linear programming, using the default probability model, the preset fractional span, the basic total amount calculated from the basic amount data, the bad amount data determined from objects with overdue loans, and the preset risk constraint condition, which effectively reduces the total amount bad rate. The layering coefficient is determined from the preset dimensions and used to further adjust the credit limit, which effectively raises the credit limits of high-quality objects. Finally, the credit limit of the object is determined from the layering coefficient, the risk coefficient, and the basic amount data. In this way client risk is assessed accurately, the personal loan limit is determined accurately, personal loan default risk is reduced, and the overall income from personal loans is improved.
In addition, the characteristic variables are obtained by analyzing and processing the target variables used for model training. The characteristic variables cover multiple dimensions and are meaningful for model training, so a default probability model trained on them can estimate default probability accurately.
Drawings
FIG. 1 is a schematic flow chart illustrating a method for calculating a loan amount according to an embodiment;
FIG. 2 is a flowchart illustrating the steps of a method for training a default probability model in one embodiment;
FIG. 3 is a flowchart illustrating step S202 according to an embodiment;
FIG. 4 is a schematic flow chart illustrating the feature filtering step of step S306 in one embodiment;
FIG. 5 is a flowchart illustrating the step S104 according to an embodiment;
FIG. 6 is a flowchart illustrating step S508 according to an embodiment;
FIG. 7 is a flowchart illustrating the step S106 according to an embodiment;
FIG. 8 is a block diagram schematically showing the construction of a loan amount calculation apparatus according to an embodiment;
FIG. 9 is a diagram showing an internal configuration of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more clearly understood, the present disclosure is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not intended to limit the disclosure.
In the embodiments herein, the term "and/or" merely describes an association between associated objects and indicates that three relations are possible. For example, "A and/or B" may mean: A exists alone, A and B both exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relation between the objects before and after it.
It should be noted that the terms "first," "second," and the like in the description and claims herein and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments herein described are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or device.
In recent years, as consumption levels and consumption concepts have continued to change and rise, personal credit has accounted for a growing share of lending. At present, however, determining the credit limit for personal credit has the following problems:
(1) Data and information are incomplete. Most existing credit limit calculation methods consider only one-sided, basic data, such as existing assets, income, and liabilities. Considering only such data leads to excessive credit granting, which greatly increases personal loan default risk.
(2) Existing limit calculation methods use the traditional scorecard method, which considers only the client's risk coefficient and does not segment clients according to their specific circumstances. Without a limit calculation model based on client segmentation, the calculated limit is inaccurate, which further increases personal loan default risk.
To address these problems, the present disclosure provides a loan amount calculation method. This embodiment is described with the method applied to a terminal. It is understood that the method may also be applied to a server, or to a system comprising a terminal and a server, where it is implemented through interaction between the terminal and the server. The terminal may be, but is not limited to, a personal computer, notebook computer, smartphone, or tablet computer, and the server may be implemented as an independent server or as a server cluster consisting of multiple servers. In this embodiment, as shown in FIG. 1, the method includes the following steps:
S102, calculating the difference between income amount data and liability amount data to obtain basic amount data of the object, wherein the income amount data is determined according to financial information of the object, and the liability amount data is determined according to credit information of the object.
The income amount data may be the income amount of the object determined from multi-dimensional data in the financial information. The liability amount data may be the liabilities of the object determined from the credit amount and the credit card amount in the credit information. The basic amount data may represent the most basic amount available to the object. The financial information may be information on the object's expenses and income obtained through financial institutions. The credit information may be personal credit information that is collected, organized, and stored in a personal credit database established by a designated institution; the database provides credit report inquiry services to commercial banks and individuals, and provides information services for monetary policy making, financial supervision, and other purposes stipulated by laws and regulations. An object generally refers to an individual or an individually owned business that can take out a loan.
Specifically, financial information of the object is obtained from the People's Bank of China, the banking regulator, or UnionPay. The financial information may include multi-dimensional data such as annual income, annual payroll, housing provident fund contributions, and account transaction volume. Multi-dimensional data fits the individual circumstances of different objects and allows the characteristics of different products and scenarios to be taken into account, which is an advantage over single-dimensional data. The income amount data is calculated from the financial information and corresponding coefficients preset by those skilled in the art. The liability amount data is determined from information in the object's credit information, such as the total monthly repayment on existing consumer loans and the average monthly used credit card amount. The liability amount data helps to examine the object's current liabilities and incorporates information from other financial institutions, which avoids excessive credit granting and further reduces risk. In some embodiments, the liability amount data may be calculated as: liability amount data = total monthly repayment on existing consumer loans + average monthly used credit card amount. The difference between the income amount data and the liability amount data is then calculated, and the basic amount data of the object is obtained from the difference.
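Step S102 can be sketched as follows. The source says only that the income amount data is computed from financial information and preset coefficients, so the weighted sum used here is an assumption, and the field names and weights are hypothetical. The liability calculation assumes the two monthly items are added together.

```python
def income_amount(financial_info, weights):
    """Weighted sum of financial items (assumed form; weights are the
    "corresponding coefficients" and are hypothetical here)."""
    return sum(weight * financial_info.get(name, 0.0)
               for name, weight in weights.items())


def liability_amount(monthly_loan_repayment, avg_monthly_card_usage):
    """Total monthly repayment on existing consumer loans plus the average
    monthly used credit card amount."""
    return monthly_loan_repayment + avg_monthly_card_usage


def basic_amount(income, liability):
    """Basic amount data: income amount data minus liability amount data."""
    return income - liability


# Hypothetical example values.
info = {"annual_income": 100000.0, "housing_fund": 12000.0}
weights = {"annual_income": 0.5, "housing_fund": 2.0}
example_basic = basic_amount(income_amount(info, weights),
                             liability_amount(3000.0, 2000.0))
```

How a negative difference should be handled (for example, by rejecting the application) is not stated in the source, so the sketch returns the raw difference.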
S104, calculating a risk coefficient by linear programming, using a pre-trained default probability model, a preset fractional span, a basic total amount calculated from the basic amount data, bad amount data determined from objects with overdue loans, and a preset risk constraint condition.
The pre-trained default probability model may be a model trained using methods such as feature engineering and machine learning. Feature engineering is the process of transforming raw data into features that can be used as input to algorithms and/or models; it is a way of representing and presenting data, and in practice it aims to remove noise and redundancy from the raw data. Machine learning is an interdisciplinary branch of artificial intelligence that studies how algorithms can improve their performance by learning from experience. Popular machine learning algorithms currently include gradient boosted trees (GBDT, LGBM, etc.), linear regression, naive Bayes, random forests, and ensemble models. Those skilled in the art can select a suitable machine learning algorithm according to the actual situation.
The fractional span may be data from which a conversion coefficient is calculated, so that the level of the object can be determined through the conversion coefficient. The bad amount data may be the basic amount data of objects whose loans are overdue. The risk constraint condition generally refers to the constraints that define the linear programming problem. The risk coefficient may be a coefficient for adjusting the credit limit of the object, chosen so that the bad rate of the limit is minimized.
Specifically, the default probability is calculated by the pre-trained default probability model, and a conversion coefficient is calculated from the fractional span. The default probability is converted into a score using the conversion coefficient, and the risk level of the object is determined from the score. The risk coefficient of each risk level is then obtained by linear programming from the basic total amount calculated from the basic amount data, the bad amount data determined from objects with overdue loans, and the preset risk constraint condition.
S106, determining the object level of each dimension according to the preset dimension, and determining the layering coefficient of the object based on the object level of each dimension.
The preset dimension may be a dimension determined according to a region, a payroll level, loan information, and the like. The tier factor may generally be a factor that can reduce the amount of credit risk.
Specifically, the level of the object in each dimension is determined according to the unit (employer) dimension, the payroll level dimension, the unit area dimension, the payroll month dimension (the number of months in which payroll was disbursed), the credit history length dimension, and the historical repayment performance dimension, and the layering coefficient of the object is determined from these levels. A person skilled in the art can determine the layering coefficient of the object by a preset determination rule, or from the preset dimensions based on experience.
S108, calculating the loan amount of the object using the basic amount data, the risk coefficient and the layering coefficient.
Specifically, loan amount of the object = basic amount data × risk coefficient × layering coefficient.
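The final multiplication can be written out directly. In the sketch below the lookup tables and all coefficient values are hypothetical placeholders; in the method they would come from the risk coefficient step and the layering coefficient step.

```python
def loan_amount(basic_amount, risk_coefficient, layering_coefficient):
    """Loan amount = basic amount x risk coefficient x layering coefficient."""
    return basic_amount * risk_coefficient * layering_coefficient


# Hypothetical lookup tables, for illustration only.
RISK_COEFFICIENT_BY_LEVEL = {"A": 1.4, "B": 1.2, "C": 1.0, "D": 0.8, "E": 0.5}
LAYERING_COEFFICIENT_BY_TIER = {"high_quality_1": 1.2, "ordinary_4": 0.9}

amount = loan_amount(
    50_000.0,
    RISK_COEFFICIENT_BY_LEVEL["B"],
    LAYERING_COEFFICIENT_BY_TIER["high_quality_1"],
)
```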
In the loan amount calculation method above, the income amount and the liability amount are determined by combining financial data with credit data, and the basic amount data is determined from them. The risk coefficient of each object is calculated by linear programming from the default probability given by the default probability model, the preset fractional span, the basic total amount calculated from the basic amount data, the bad amount data determined from objects whose loans are overdue, and the preset risk constraint conditions, which effectively reduces the total amount bad rate. The layering coefficient is used to adjust the credit limit, which effectively raises the credit limit of high-quality objects. Finally, the loan amount of the object is determined from the layering coefficient, the risk coefficient, and the basic amount data. In this way the risk of each object can be assessed accurately and the amount of a personal loan determined accurately, reducing personal loan default risk and increasing the overall income from personal loans.
In one embodiment, as shown in fig. 2, the method for training the default probability model includes:
S202, analyzing and processing the target variables determined through account age (vintage) analysis to obtain characteristic variables, wherein the target variables comprise objects whose loans are overdue beyond a first time and objects whose loans are not overdue.
Account age analysis, also known as vintage analysis, is widely used in the financial credit industry. It tracks credit accounts opened in different periods separately and compares them at the same account age, in order to understand the quality of the assets approved in different periods. The analysis processing is a way of processing the target variables and may include both data analysis and data processing. Characteristic variables are attributes of the raw data that have been transformed by processing into features that can be used to train the model.
Specifically, the characteristic variables are obtained by performing data analysis and data processing on the target variables determined by the vintage analysis. The target variables may include objects whose loans are overdue beyond a first time (generally defined as bad customers) and objects whose loans are not overdue (generally defined as good customers). It should be noted that the first time is a preset time that a person skilled in the art may set according to the specific scenario, for example 30 days or 60 days; this embodiment does not limit it. In this embodiment the first time is typically 30 days.
S204, performing model fitting on the characteristic variables using a logistic regression algorithm, and performing model evaluation on the logistic regression model obtained by the fitting.
Logistic regression, also called logistic regression analysis, is a generalized linear regression model. Model fitting may be a supervised learning process: the relationship between inputs and outputs is learned from an existing data set (the training set), and an optimal model is trained from this relationship. Once the relationship between features and labels has been found, the label of data that has features but no label can be predicted more accurately.
Specifically, the obtained characteristic variables are divided into a training set and a verification set. The division may be random, or may use a time point: the characteristic variables before the time point form the training set and those after the time point form the verification set, or alternatively those before the time point form the verification set and those after it form the training set. Random division may split the data into training and verification sets according to a preset ratio or in another way. After the division, model fitting is performed on the training set using the logistic regression algorithm (this is the model training process), and a logistic regression model is obtained. Model evaluation is then performed on the obtained logistic regression model.
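The two division modes described above can be sketched as follows. The record layout (a dict with an ISO-format "date" string), the function names and the default 70/30 ratio are assumptions made for illustration, not details given in the source.

```python
import random


def split_by_time(records, time_point, key="date"):
    """Records dated before time_point go to training, the rest to verification.

    ISO-format date strings (YYYY-MM-DD) compare correctly as plain strings.
    """
    train = [r for r in records if r[key] < time_point]
    valid = [r for r in records if r[key] >= time_point]
    return train, valid


def split_randomly(records, train_ratio=0.7, seed=0):
    """Shuffle a copy of the records and cut it at the preset ratio."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

A time-based split is often preferred for credit data because it checks the model on loans issued later than those it was trained on.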
S206, when, in the model evaluation, the evaluation index is not lower than a first preset value and the stability index is not higher than a second preset value, taking the evaluated logistic regression model as the default probability model.
Model evaluation may generally include evaluating the accuracy of the model and evaluating the stability of the model. The evaluation index may generally be the KS (Kolmogorov-Smirnov) value, which evaluates the accuracy of the model by measuring how well the model separates positive and negative samples. The stability index may typically be the PSI (Population Stability Index) value, which evaluates the stability of the model.
Specifically, the KS value and the PSI value of the logistic regression model are calculated. In the model evaluation, when the KS value is not lower than the first preset value (which may be 0.35) and the PSI value is not higher than the second preset value (which may be 0.1), the evaluated logistic regression model is taken as the default probability model.
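As an illustration of the acceptance test, the sketch below computes KS and PSI with their commonly used definitions, which the source does not spell out: KS as the largest gap between the cumulative bad rate and the cumulative good rate over the sorted predictions, and PSI as the sum over bins of (actual - expected) × ln(actual / expected). The function names and the `eps` guard against empty bins are assumptions.

```python
import math


def ks_statistic(scores, labels):
    """KS value; labels use 1 for bad (overdue) and 0 for good."""
    total_bad = sum(labels)
    total_good = len(labels) - total_bad
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Samples with the same score are accumulated together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / total_bad - cum_good / total_good))
        i = j
    return ks


def psi(expected_fractions, actual_fractions, eps=1e-6):
    """Population Stability Index over matching bins."""
    total = 0.0
    for e, a in zip(expected_fractions, actual_fractions):
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def accept_model(ks_value, psi_value, ks_min=0.35, psi_max=0.1):
    """Thresholds follow the example values 0.35 and 0.1 given in the text."""
    return ks_value >= ks_min and psi_value <= psi_max
```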
The default probability model may be:
$$p = \frac{1}{1 + e^{-w^{T}x}}$$
where p is the default probability, x is the vector of characteristic variables, and w is the vector of coefficients of the characteristic variables.
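A minimal sketch of evaluating such a model, assuming the standard logistic form p = 1 / (1 + exp(-w·x)). The `bias` term and the function name are hypothetical additions; the source only names p, x and w.

```python
import math


def default_probability(x, w, bias=0.0):
    """Logistic function of the weighted sum of the characteristic variables."""
    z = bias + sum(wi * xi for wi, xi in zip(w, x))
    # Two algebraically equal branches keep math.exp from overflowing.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```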
If the KS value of the evaluated model is lower than the first preset value and/or the PSI value is higher than the second preset value, the model is fitted again on the training set until the evaluation index is not lower than the first preset value and the stability index is not higher than the second preset value.
In another embodiment, the default probability model may also use a credit scorecard to predict the default probability. A credit scorecard measures risk in the form of a credit score and predicts the probability of overdue within a future period. Its principle is a generalized linear model for a binary target variable: the variables are discretized and encoded by WOE (Weight of Evidence), and logistic regression is then applied.
In this embodiment, data analysis and data processing of the target variables yield the characteristic variables, which cover multiple dimensions and are meaningful for model training. Training the default probability model on these characteristic variables allows it to estimate the default probability accurately.
In one embodiment, as shown in fig. 3, analyzing and processing the target variables determined by account age analysis to obtain the characteristic variables includes:
S302, acquiring information data of the objects used for building the model, determining the target variables of the information data through account age analysis, and acquiring modeling data for the target variables, wherein the modeling data comprises self-owned data and third-party data acquired with the object's authorization.
Specifically, historical loan objects are retrieved from a database of the financial institution, and those meeting preset conditions are selected as the objects for building the model. The customer standard of these objects, namely the target variable, is determined through vintage analysis as follows: customers whose loans are overdue by more than 30 days are defined as bad customers; customers whose loans are not overdue are defined as good customers; and customers whose loans are overdue but by no more than 30 days are defined as uncertain customers and are temporarily removed from the objects for building the model. Modeling data is then acquired for the target variables; it may comprise self-owned data and third-party data acquired with the object's authorization. The self-owned data may include loan information of the object, such as the loan date and the repayment date. The third-party data may include credit information, financial information, and the like.
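The customer standard can be sketched as a small labelling function. The sketch assumes that customers with no overdue days are good, customers overdue by more than the threshold are bad, and everyone in between is uncertain and left out of the modeling sample; the function names and the tuple layout are hypothetical.

```python
def label_customer(max_overdue_days, bad_threshold_days=30):
    """Return 'bad', 'good' or 'uncertain' from the longest overdue period."""
    if max_overdue_days > bad_threshold_days:
        return "bad"
    if max_overdue_days == 0:
        return "good"
    return "uncertain"


def modeling_sample(customers, bad_threshold_days=30):
    """Keep only good and bad customers; customers is a list of (id, overdue_days)."""
    sample = []
    for customer_id, overdue_days in customers:
        label = label_customer(overdue_days, bad_threshold_days)
        if label != "uncertain":
            sample.append((customer_id, label))
    return sample
```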
S304, performing descriptive statistics on the modeling data.
Descriptive statistics generally refers to characterizing data by tabulation, classification, charts, and summary measures. Descriptive statistical analysis describes the data for all variables of the surveyed population and mainly includes frequency analysis, central tendency analysis, dispersion analysis, distribution analysis, and basic statistical charts.
Specifically, descriptive statistics are performed on the modeling data to evaluate the distribution of the values of each variable, identify extreme values, and so on.
S306, performing data processing on the modeling data that has undergone descriptive statistics to obtain the characteristic variables, wherein the data processing comprises: duplicate value deletion, outlier handling, missing value handling, data standardization, feature derivation, variable binning, weight-of-evidence conversion, and feature screening according to the information values and the correlation coefficients with the variables obtained by feature derivation.
The weight-of-evidence conversion may be WOE (Weight of Evidence) conversion; the WOE expresses the influence on the target variable of an independent variable taking a certain value. The information value (IV) indicates the degree of influence of an independent variable on the target variable. Both WOE and the information value measure the predictive power of a variable: the larger the value, the stronger the predictive power.
Specifically, the modeling data that has undergone descriptive statistics is subjected to duplicate value deletion, outlier handling, missing value handling, data standardization, feature derivation, variable binning, weight-of-evidence conversion, and feature screening according to the information values and the correlation coefficients with the variables obtained by feature derivation.
The feature derivation may include cross-comparison, that is, deriving related information from information in the target variables (for example, deriving related information from the address information provided by the object).
The variable binning may use an optimal binning method based on a tree model, or another optimal binning method such as chi-square binning; this embodiment does not limit the binning method.
In one embodiment, as shown in fig. 4, the performing feature screening according to the information values and the correlation coefficients of the variables derived by the features includes:
S402, calculating the information value of the modeling data;
S404, deleting the modeling data whose information value is smaller than a first information threshold or larger than a second information threshold;
S406, calculating the correlation coefficients between the modeling data and the variables obtained by feature derivation;
S408, among the modeling data whose correlation coefficient is larger than a correlation coefficient threshold, retaining the modeling data with the largest information value.
Specifically, the WOE value is calculated by the formula:
Figure BDA0003363879650000151
and the information value of the modeling data is then calculated by the formula:
Figure BDA0003363879650000152
The modeling data whose information value is smaller than the first information threshold or larger than the second information threshold is deleted to obtain first modeling data; in some embodiments, the first information threshold may be 0.02 and the second information threshold may be 0.5. The correlation coefficients between the modeling variables and the variables obtained by feature derivation are calculated with a correlation coefficient formula. The modeling data whose correlation coefficient is larger than the correlation coefficient threshold is recorded as second modeling data, and the modeling data with the largest information value within the second modeling data is selected. This selected modeling data is combined with the first modeling data to obtain the final modeling data, namely the characteristic variables.
In this embodiment, screening the modeling variables by information value and correlation coefficient yields modeling variables with better predictive power, which improves the accuracy with which the default probability model calculates the default probability.
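The information value screening can be illustrated with the widely used WOE and IV definitions. These are assumed here, because the formulas in the source are images: WOE of a bin is taken as ln(bad share / good share), and IV as the sum over bins of (bad share - good share) × WOE. The opposite sign convention for WOE gives the same IV. The sketch also assumes every bin contains at least one good and one bad customer, and it omits the correlation step.

```python
import math


def woe_iv(bins):
    """bins is a list of (good_count, bad_count); returns (woe_per_bin, iv)."""
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woes = []
    iv = 0.0
    for good, bad in bins:
        good_share = good / total_good
        bad_share = bad / total_bad
        woe = math.log(bad_share / good_share)
        woes.append(woe)
        iv += (bad_share - good_share) * woe
    return woes, iv


def keep_by_iv(iv_by_variable, low=0.02, high=0.5):
    """Drop variables whose IV is below `low` or above `high` (example thresholds)."""
    return {name: iv for name, iv in iv_by_variable.items() if low <= iv <= high}
```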
In one embodiment, as shown in fig. 5, the obtaining of the risk coefficient by using a trained default probability model, a preset fractional span, a basic total amount calculated according to the basic amount data, bad amount data determined according to the object whose loan is overdue, a preset risk constraint condition, and linear programming calculation includes:
S502, calculating the default probability of the object using the pre-trained default probability model.
S504, determining the score of the object based on the default probability and the conversion coefficients calculated from the fractional span.
S506, determining the risk level of the object according to the score.
S508, calculating the basic total amount and the bad amount data within each risk level, and performing linear programming under the preset risk constraint conditions to obtain the risk coefficient corresponding to each risk level.
Specifically, after the default probability is calculated by using a pre-trained default probability model, the conversion coefficient is calculated based on the preset fractional span and by the following formula:
Figure BDA0003363879650000161
where A and B are the calculated conversion coefficients, and PDO and p0 are preset constants (specific score points); in some embodiments, PDO may be 20 and p0 may be 600.
After the conversion coefficient is calculated, the score of the corresponding object is determined by the following formula:
Figure BDA0003363879650000162
After the score of the object is calculated, the object is graded according to preset grading criteria and its score, which determines the risk level of the object. The preset grading criteria may include:
(1) The upper and lower bounds of the score interval of each risk level are positive numbers.
(2) The objects may be divided into 5 risk levels, which account, from the highest level to the lowest, for 20%, 25%, 30%, 20% and 5% of the objects, respectively.
(3) The overdue rate increases in sequence from the highest risk level to the lowest.
(4) The lift rate increases in sequence from the highest risk level to the lowest, wherein the lift rate may be calculated by the following formula:
Figure BDA0003363879650000163
In some embodiments, a specific risk rating table is shown in Table 1:
TABLE 1 Risk ratings table
Figure BDA0003363879650000171
It should be noted that the data in Table 1 is only an example. The cumulative number of overdue customers at risk level E is the number of overdue customers at level E; the cumulative number at level D is the sum of the numbers of overdue customers at levels E and D; and so on for each level. The cumulative number of all customers is calculated in the same way as the cumulative number of overdue customers and is not described again here.
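The score conversion and grading can be sketched under stated assumptions. Because the conversion formulas in the source are images, the sketch uses the common scorecard scaling instead: B = PDO / ln 2, A = p0 + B × ln(base_odds), and score = A - B × ln(p / (1 - p)), so that a higher score means a lower default probability. The `base_odds` parameter (the bad-to-good odds at which the score equals p0) is hypothetical, as is the assumption that level A holds the highest scores.

```python
import math


def scaling_coefficients(pdo=20.0, p0=600.0, base_odds=1.0 / 50.0):
    """Assumed scaling: doubling the odds of default lowers the score by pdo."""
    b = pdo / math.log(2.0)
    a = p0 + b * math.log(base_odds)
    return a, b


def score(p, a, b):
    """Convert a default probability into a score."""
    return a - b * math.log(p / (1.0 - p))


def assign_grades(scores, shares=(0.20, 0.25, 0.30, 0.20, 0.05), names="ABCDE"):
    """Rank by score (highest first) and cut the ranking at the preset shares."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    grades = [None] * len(scores)
    cumulative = 0.0
    start = 0
    for share, name in zip(shares, names):
        cumulative += share
        end = round(cumulative * len(scores))
        for i in order[start:end]:
            grades[i] = name
        start = end
    # Anything left over through rounding falls into the last level.
    for i in order[start:]:
        grades[i] = names[-1]
    return grades
```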
In one embodiment, as shown in fig. 6, calculating the basic total amount and the bad amount data in the risk level and performing linear programming calculation according to a preset risk constraint condition to obtain a risk coefficient corresponding to the level includes:
S602, calculating the sum of the basic amount data of the objects in each risk level to obtain the basic total amount data of that level.
S604, calculating the sum of the basic amount data of the objects with overdue loans in each risk level to obtain the bad amount data of that level.
S606, calculating a risk adjustment coefficient for each risk level from the basic total amount data and the bad amount data of each risk level and the preset risk constraint conditions.
S608, calculating the total amount bad rate from the basic total amount data, the bad amount data and the risk adjustment coefficient of each risk level.
S610, when the total amount bad rate is at its minimum and the risk constraint conditions are satisfied, taking the corresponding risk adjustment coefficients as the risk coefficients of the risk levels.
Linear programming is an important branch of operations research and is widely applied in military operations, economic analysis, operations management, engineering, and other fields. It provides a scientific basis for making optimal decisions about the rational use of limited human, material, and financial resources.
Specifically, the basic amount data of the objects in each level is obtained and summed to give the basic total amount data of that level; the objects with overdue loans in each level are identified, and their basic amount data is summed to give the bad amount data of that level. The risk adjustment coefficient of each risk level is then calculated from the basic total amount data and bad amount data of each level under the preset risk constraint conditions of the linear program. At this point the risk adjustment coefficients are only a rough solution and have not yet been required to meet the optimality condition. The total amount bad rate is calculated from the basic total amount data, the bad amount data and the risk adjustment coefficient of each risk level, and the risk adjustment coefficients are adjusted so that the total amount bad rate is minimized.
The bad rate before risk coefficient adjustment may also be calculated: bad rate before adjustment = sum of the bad amount data of all levels / sum of the basic total amount data of all levels. Comparing it with the total amount bad rate after adjustment shows that the bad rate before adjustment is higher, so the risk coefficients reduce the overall bad rate.
In this embodiment, a risk adjustment coefficient is calculated for each risk level and then adjusted, and the risk adjustment coefficients at which the total amount bad rate is minimal are taken as the risk coefficients of the risk levels. The purpose of solving for the risk coefficients is to raise the amounts of customers with lower default probability and lower the amounts of bad customers with higher default probability, so that the total amount bad rate is reduced and overdue risk is controlled.
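The per-level sums and the bad rate before adjustment can be sketched as below. The input layout, a list of (risk level, basic amount, overdue flag) tuples, and the function names are assumptions made for illustration.

```python
def level_totals(objects):
    """Return {level: (basic_total_amount, bad_amount)} from per-object records."""
    totals = {}
    for level, amount, overdue in objects:
        basic_total, bad = totals.get(level, (0.0, 0.0))
        totals[level] = (basic_total + amount, bad + (amount if overdue else 0.0))
    return totals


def bad_rate_before_adjustment(totals):
    """Sum of bad amounts over all levels / sum of basic total amounts."""
    total_basic = sum(basic for basic, _ in totals.values())
    total_bad = sum(bad for _, bad in totals.values())
    return total_bad / total_basic
```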
In one embodiment, the preset risk constraints include:
the risk coefficients, arranged by risk level, decrease in sequence;
the difference between adjacent risk coefficients is greater than or equal to a first preset difference;
the risk coefficients ranked first and second are above a first threshold, and the remaining risk coefficients are not above the first threshold;
the quota ratio of each risk level is calculated from the risk coefficient and the basic total amount data of that level, and the sum of the quota ratios of all risk levels equals a first preset percentage.
Specifically, in some embodiments, for example, the risk coefficients ranked according to risk level A, B, C, D, E are X1, X2, X3, X4, and X5, respectively, then the preset risk constraint may be:
(1) The risk coefficients X1 to X5 decrease in sequence.
(2) Adjacent risk coefficients among X1 to X5 differ by 0.1 or more; for example, the difference between X1 and X2 is 0.1 or more, and the difference between X2 and X3 is 0.1 or more.
(3) The risk coefficients X1 and X2 are greater than 1, and the risk coefficients X3, X4 and X5 are less than or equal to 1.
(4) The quota ratio of each risk level is calculated from the risk coefficient and the basic total amount data of that level, and the sum of the quota ratios of all levels is one hundred percent.
In this embodiment, the risk coefficients of risk levels A and B, which have low default probability, are set greater than 1, so that their amounts are increased after adjustment; the risk coefficients of risk levels C, D and E, which have higher default probability, are set no greater than 1, so that their amounts can be reduced.
In one embodiment, the quota ratio of each risk level is calculated by adopting the risk adjustment coefficient of each risk level and the basic total quota data according to the following formula:
Figure BDA0003363879650000191
The total amount bad rate is calculated from the basic total amount data, the bad amount data and the risk adjustment coefficient of each risk level by the following formula:
$$\text{total amount bad rate} = \frac{\sum_{i} b_i X_i}{\sum_{i} a_i X_i}$$
where $b_i$ is the bad amount data of the i-th risk level, $X_i$ is the risk coefficient of the i-th risk level, and $a_i$ is the basic total amount data of the i-th risk level.
Specifically, the total amount bad rate is the sum over all levels of the bad amount data multiplied by the corresponding risk coefficient, divided by the sum over all levels of the basic total amount data multiplied by the corresponding risk coefficient.
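To make the optimization concrete, the sketch below searches for risk coefficients that minimize the total amount bad rate under the example constraints for five levels. It is an exhaustive grid search over multiples of 0.1, used as a plain stand-in for a linear programming solver. Several details are assumptions: the quota ratio of a level is taken as a_i × X_i divided by the sum of all a_i, so "ratios sum to one hundred percent" means the adjusted total equals the original total; that equality is only enforced within a tolerance `tol`; and coefficients are capped at 2.0.

```python
from itertools import combinations


def total_bad_rate(a, b, x):
    """sum(b_i * X_i) / sum(a_i * X_i)."""
    numerator = sum(bi * xi for bi, xi in zip(b, x))
    denominator = sum(ai * xi for ai, xi in zip(a, x))
    return numerator / denominator


def search_risk_coefficients(a, b, tol=0.01, max_tenths=20):
    """Return (best_rate, coefficients) for 5 levels, or None if nothing is feasible."""
    base_total = sum(a)
    best = None
    # Strictly decreasing integer tenths give adjacent differences of at least 0.1.
    for tenths in combinations(range(max_tenths, 0, -1), 5):
        if tenths[1] <= 10 or tenths[2] > 10:
            continue  # X1 and X2 must exceed 1; X3, X4 and X5 must not.
        x = [t / 10 for t in tenths]
        adjusted_total = sum(ai * xi for ai, xi in zip(a, x))
        if abs(adjusted_total / base_total - 1.0) > tol:
            continue  # quota ratios must sum to roughly 100%
        rate = total_bad_rate(a, b, x)
        if best is None or rate < best[0]:
            best = (rate, x)
    return best


# Made-up per-level totals for levels A to E.
a = [200.0, 250.0, 300.0, 200.0, 50.0]
b = [2.0, 5.0, 12.0, 16.0, 10.0]
result = search_risk_coefficients(a, b)
```

With the adjusted total held fixed, minimizing the ratio amounts to minimizing its numerator, which is linear in the coefficients; this is why a linear programming solver can replace the grid search.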
In one embodiment, as shown in fig. 7, determining the hierarchy of the object in each dimension according to a preset dimension, and determining the layering coefficient of the object based on the hierarchy of the object in each dimension includes:
S701, determining the unit dimension level of the object according to the unit dimension.
Specifically, the unit dimension is set according to the nature of the object's unit (employer) and includes: state-owned enterprises, government agencies, public institutions, the military, private enterprises, foreign enterprises, and the like. The unit level of the object is determined according to the object's unit. Employment at a state-owned enterprise, government agency, public institution or the military indicates that the object's job is stable, so these units may be assigned a higher unit level.
S702, determining the payroll level of the object according to the payroll level dimension of the object.
Specifically, after the unit of the object is determined, the object is graded according to its payroll level within the unit: the agency-paid payrolls of all employees in the unit are sorted and divided into several grades, and the grade to which the object belongs is determined. The higher the grade, the higher the payroll level of the object.
In some embodiments, the agency-paid payroll may be divided into levels a, b, c, d, e, f, g, h and i, each matching a quantile range as shown in Table 2; a higher quantile indicates a higher payroll level.
TABLE 2 payroll level hierarchy table
Payroll quantile within the unit    Level
Above the 90th quantile             a
80th-90th quantile                  b
70th-80th quantile                  c
60th-70th quantile                  d
50th-60th quantile                  e
40th-50th quantile                  f
30th-40th quantile                  g
20th-30th quantile                  h
Below the 20th quantile             i
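The lookup in Table 2 can be sketched as follows. The way the quantile is computed (the percentage of payrolls in the unit that do not exceed the object's payroll) and the treatment of values exactly on a boundary (assigned to the higher level) are assumptions; the source gives only the ranges.

```python
LEVEL_LOWER_BOUNDS = [
    (90, "a"), (80, "b"), (70, "c"), (60, "d"),
    (50, "e"), (40, "f"), (30, "g"), (20, "h"),
]


def payroll_quantile(payroll, unit_payrolls):
    """Percentage of payrolls in the unit that do not exceed `payroll`."""
    not_above = sum(1 for p in unit_payrolls if p <= payroll)
    return 100.0 * not_above / len(unit_payrolls)


def payroll_level(quantile):
    """Map a quantile in [0, 100] to a level from 'a' (highest) to 'i' (lowest)."""
    for lower_bound, level in LEVEL_LOWER_BOUNDS:
        if quantile >= lower_bound:
            return level
    return "i"
```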
S703, determining the unit area level of the object according to the unit area dimension.
Specifically, the unit area dimension is determined according to the city where the object's unit is located and includes: first-tier city, new first-tier city, second-tier city, third-tier city, fourth-tier city, and fifth-tier city. The unit area level is determined according to the city where the unit is located; first-tier and new first-tier cities correspond to a higher unit area level.
S704, determining the payroll month level according to the payroll month dimension of the object.
Specifically, the number of months in which the object received agency-paid payroll during the last 12 months is counted; for example, if the object received payroll in 3 of the last 12 months, the number of payroll months is 3. The payroll month dimension includes: extremely short (0-3 payroll months), short (4-6), average (6-9), long (9-11), and extremely long (12). The object is matched to the corresponding category according to its number of payroll months in the last 12 months, which determines its payroll month level. The more payroll months, the higher the payroll month level.
S705, determining a credit hierarchy according to the credit history length dimension of the object.
Specifically, the credit history length of the object is obtained by subtracting the application date of the object's first loan from the application date of its most recent loan. The credit history length dimension is divided by duration into 5 grades from low to high: extremely short (grade A), short (grade B), medium (grade C), long (grade D), and extremely long (grade E). A person skilled in the art can set the grade boundaries based on experience; this embodiment does not limit the durations used. The credit level is determined from the credit history length dimension. The shorter the credit history length, the higher the corresponding credit level.
S706, determining the repayment dimension level according to the historical repayment performance dimension of the object.
Specifically, the historical repayment performance dimension is determined according to the repayment of the object's consumer loans and may include: after the history is over, normally settled (fewer than 3 loans), normally settled (3 or more loans), and normally settled (6 or more loans). Normally settled (3 or more loans) and normally settled (6 or more loans) correspond to a higher repayment dimension level.
S707, determining the hierarchical level of the object according to the unit dimension level, the payroll level, the unit area level, the payroll month level, the credit level and the repayment dimension level, and determining the layering coefficient according to the hierarchical level. The hierarchical levels may include: high-quality customer 1, high-quality customer 2, high-quality customer 3, ordinary customer 4, ordinary customer 5 and ordinary customer 6. After adjustment by the layering coefficient, a high-quality customer obtains a higher amount, whereas the amount of an ordinary customer is reduced. The layering coefficient corresponding to each hierarchical level is set according to a preset rule or the experience of a person skilled in the art, as shown in Table 3.
TABLE 3 hierarchical level and hierarchical coefficient correspondence
Figure BDA0003363879650000211
Figure BDA0003363879650000221
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not restricted to the exact order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 8, there is provided a loan amount calculation apparatus 800 including: an income amount determining module 801, a liability amount determining module 802, a basic amount calculating module 803, a risk coefficient calculating module 804, a layering coefficient determining module 805, and a loan amount calculating module 806, wherein:
The income amount determining module 801 is configured to determine income amount data according to the financial information of the object.
The liability amount determining module 802 is configured to determine liability amount data according to the credit information of the object.
The basic amount calculating module 803 is configured to calculate the difference between the income amount data and the liability amount data to obtain the basic amount data of the object.
The risk coefficient calculating module 804 is configured to calculate a risk coefficient by linear programming, using a pre-trained default probability model, a preset conversion coefficient, a basic total amount calculated according to the basic amount data, bad amount data determined according to objects whose loans are overdue, and a preset risk constraint condition.
The layering coefficient determining module 805 is configured to determine, according to preset dimensions, the level of the object in each dimension, and to determine the layering coefficient of the object based on the level of the object in each dimension.
The loan amount calculating module 806 is configured to calculate the loan amount of the object by using the basic amount data, the risk coefficient, and the layering coefficient.
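The data flow through modules 803 and 806 can be sketched as below. The text states only that the loan amount is calculated "by using" the three quantities; combining them as a product, and flooring a negative basic amount at zero, are assumptions made for this illustration, as are the function names.

```python
def basic_amount(income_amount: float, liability_amount: float) -> float:
    """Basic amount = income amount minus liability amount.

    Flooring at zero is an assumption; the source does not say how a
    negative difference is handled.
    """
    return max(income_amount - liability_amount, 0.0)


def loan_amount(
    income_amount: float,
    liability_amount: float,
    risk_coefficient: float,
    layering_coefficient: float,
) -> float:
    """Combine the three quantities as a product (assumed combination rule)."""
    base = basic_amount(income_amount, liability_amount)
    return base * risk_coefficient * layering_coefficient
```

For example, `loan_amount(200000.0, 50000.0, 0.5, 2.0)` evaluates a basic amount of 150000.0 and scales it by both coefficients.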
In one embodiment of the apparatus, the apparatus further comprises: an analysis processing module, a model evaluation module, and a model determination module.
The analysis processing module is configured to analyze and process the target variables determined through account age analysis to obtain characteristic variables, wherein the target variables comprise objects whose loans are overdue for longer than a first duration and objects whose loans are not overdue.
The model evaluation module is configured to perform model fitting on the characteristic variables by using a logistic regression algorithm and to evaluate the logistic regression model obtained by the fitting.
The model determination module is configured to determine the evaluated logistic regression model as the default probability model when, in the model evaluation performed by the model evaluation module, the evaluation index is not lower than a first preset value and the stability index is not higher than a second preset value.
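The acceptance rule of the model determination module can be sketched as follows. The source does not name the evaluation index or the stability index; this sketch assumes the Kolmogorov-Smirnov (KS) statistic and the population stability index (PSI), which are common choices for scorecards. All function names and any thresholds passed in are hypothetical.

```python
import math


def ks_statistic(scores, labels):
    """Largest gap between the cumulative bad and good distributions.

    labels: 1 for an overdue (bad) object, 0 for a non-overdue (good) one.
    Tied scores are evaluated together so ties do not inflate the result.
    """
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    ks = 0.0
    for i, (score, label) in enumerate(pairs):
        if label:
            cum_bad += 1
        else:
            cum_good += 1
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
    return ks


def psi(expected_shares, actual_shares, eps=1e-6):
    """Population stability index over matching bins of population shares."""
    total = 0.0
    for expected, actual in zip(expected_shares, actual_shares):
        expected, actual = max(expected, eps), max(actual, eps)
        total += (actual - expected) * math.log(actual / expected)
    return total


def accept_model(evaluation_index, stability_index, first_preset, second_preset):
    """Accept when the evaluation index is not lower than the first preset
    value and the stability index is not higher than the second."""
    return evaluation_index >= first_preset and stability_index <= second_preset
```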
In one embodiment of the apparatus, the analysis processing module includes: a target variable determining module, a descriptive statistics module, and a data processing module.
The target variable determining module is configured to acquire information data of the objects used for establishing the model, determine the target variable of the information data through account age analysis, and acquire modeling data in the target variable, wherein the modeling data comprises in-house data and third-party data acquired after authorization by the object.
The descriptive statistics module is configured to perform descriptive statistics on the modeling data.
The data processing module is configured to perform data processing on the modeling data subjected to the descriptive statistics to obtain characteristic variables, wherein the data processing comprises: deleting duplicate values, handling abnormal values, handling missing values, standardizing data, deriving features, binning variables, applying weight-of-evidence (WOE) transformation, and screening features according to the information values and correlation coefficients of the variables derived through the features.
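The weight-of-evidence transformation and the information value used later for screening can be sketched for a single binned variable as follows. The sign convention (log of good share over bad share) and the 0.5 smoothing for empty cells are assumptions; the source gives no formulas for these steps.

```python
import math


def woe_iv(bins):
    """Compute WOE per bin and the information value (IV) of one variable.

    bins: list of (good_count, bad_count) tuples, one per bin.
    Returns (woe_list, iv).
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    if total_good == 0 or total_bad == 0:
        raise ValueError("both good and bad objects are required")
    woes, iv = [], 0.0
    for good, bad in bins:
        # Replace an empty cell with 0.5 to avoid log(0) (assumed smoothing).
        good_share = (good or 0.5) / total_good
        bad_share = (bad or 0.5) / total_bad
        woe = math.log(good_share / bad_share)
        woes.append(woe)
        iv += (good_share - bad_share) * woe
    return woes, iv
```

A variable whose bins all have the same good-to-bad mix yields an IV of zero, while a variable that separates good from bad objects yields a positive IV.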
In one embodiment of the apparatus, the data processing module comprises: an information value calculating module, an information value deleting module, a correlation coefficient calculating module, and an obtaining module.
The information value calculating module is configured to calculate the information value of the modeling data.
The information value deleting module is configured to delete the modeling data whose information value is smaller than a first information threshold or larger than a second information threshold.
The correlation coefficient calculating module is configured to calculate the correlation coefficients between the variables derived through the features and the modeling data.
The obtaining module is configured to retain, among the modeling data whose correlation coefficient is larger than the correlation coefficient threshold, the modeling data with the largest information value.
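One way to read the screening steps above is sketched below: features outside the information value band are dropped, and within any group of mutually correlated features only the one with the largest information value is kept. The greedy ordering, the use of the Pearson coefficient, and all names and thresholds are assumptions made for the illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen_features(columns, iv, iv_min, iv_max, corr_threshold):
    """Select features by information value band and correlation.

    columns: dict mapping feature name to its list of values.
    iv: dict mapping feature name to its information value.
    """
    # Step 1: drop features whose IV is below iv_min or above iv_max.
    kept = [name for name in columns if iv_min <= iv[name] <= iv_max]
    # Step 2: visit features from highest IV down, so that among correlated
    # features the one with the largest IV is the one retained.
    kept.sort(key=lambda name: iv[name], reverse=True)
    selected = []
    for name in kept:
        if all(
            abs(pearson(columns[name], columns[other])) <= corr_threshold
            for other in selected
        ):
            selected.append(name)
    return selected
```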
In one embodiment of the apparatus, the risk coefficient calculation module 804 comprises: a default probability calculation module, a score determination module, a risk level matching module, and a risk coefficient calculation module.
The default probability calculation module is configured to calculate the default probability of the object by using the pre-trained default probability model.
The score determination module is configured to determine the score of the object based on the probability and a conversion coefficient calculated from the fractional span.
The risk level matching module is configured to determine the risk level of the object according to the score.
The risk coefficient calculation module is configured to calculate the basic total amount and the bad amount data in each risk level and to perform a linear programming calculation according to a preset risk constraint condition to obtain the risk coefficient corresponding to that level.
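The step from default probability to score to risk level can be sketched with the conventional scorecard scaling. This assumes that the "fractional span" plays the role of the points needed to double the odds, so that the conversion coefficient equals the span divided by ln 2; the source does not confirm this reading. The base score, base odds, and level cutoffs are hypothetical values.

```python
import math


def conversion_coefficient(score_span: float) -> float:
    """Points per unit of log-odds, assuming the span doubles the odds."""
    return score_span / math.log(2)


def score_from_probability(
    p_default: float,
    score_span: float = 20.0,   # hypothetical span
    base_score: float = 600.0,  # hypothetical score at the base odds
    base_odds: float = 50.0,    # hypothetical good:bad odds at the base score
) -> float:
    """Map a default probability to a score; lower probability, higher score."""
    if not 0.0 < p_default < 1.0:
        raise ValueError("p_default must lie strictly between 0 and 1")
    factor = conversion_coefficient(score_span)
    offset = base_score - factor * math.log(base_odds)
    odds = (1.0 - p_default) / p_default
    return offset + factor * math.log(odds)


def risk_level(score: float, cutoffs=(700.0, 650.0, 600.0, 550.0)) -> int:
    """Return 1 for the best band; cutoffs are hypothetical and descending."""
    for level, cutoff in enumerate(cutoffs, start=1):
        if score >= cutoff:
            return level
    return len(cutoffs) + 1
```

Under these assumptions an object at the base odds receives the base score, and each doubling of the odds adds one span to the score.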
In one embodiment of the apparatus, the risk coefficient calculation module comprises: a basic total amount data calculation module, a bad amount data calculation module, a risk adjustment coefficient calculation module, a total amount bad rate calculation module, and a risk coefficient determination module.
The basic total amount data calculation module is configured to calculate the sum of the basic amount data of the objects in each risk level to obtain the basic total amount data of each level.
The bad amount data calculation module is configured to calculate the sum of the basic amount data of the objects with overdue loans in each risk level to obtain the bad amount data of each level.
The risk adjustment coefficient calculation module is configured to calculate the risk adjustment coefficient of each risk level according to the basic total amount data and the bad amount data of each risk level and the preset risk constraint condition.
The total amount bad rate calculation module is configured to calculate the total amount bad rate from the basic total amount data, the bad amount data, and the risk adjustment coefficient of each risk level.
The risk coefficient determination module is configured to determine the risk adjustment coefficient for which the total amount bad rate is minimal and the risk constraint condition is met as the risk coefficient corresponding to the risk level.
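The two aggregation modules and the total amount bad rate can be sketched as follows. The bad rate formula used here, the coefficient-weighted bad amount divided by the coefficient-weighted basic total amount, is an assumption, because the original formula is only available as an image. The record layout and function names are hypothetical.

```python
from collections import defaultdict


def aggregate_by_level(records):
    """Sum basic amounts per risk level.

    records: iterable of (risk_level, basic_amount, is_overdue).
    Returns (levels, a, b), where a[i] is the basic total amount and b[i]
    the bad amount (overdue objects only) of levels[i].
    """
    base_total = defaultdict(float)
    bad_total = defaultdict(float)
    for level, amount, overdue in records:
        base_total[level] += amount
        if overdue:
            bad_total[level] += amount
    levels = sorted(base_total)
    a = [base_total[level] for level in levels]
    b = [bad_total[level] for level in levels]
    return levels, a, b


def total_bad_rate(a, b, x):
    """Assumed form: sum(b_i * X_i) / sum(a_i * X_i)."""
    weighted_bad = sum(b_i * x_i for b_i, x_i in zip(b, x))
    weighted_base = sum(a_i * x_i for a_i, x_i in zip(a, x))
    return weighted_bad / weighted_base
```

Raising the coefficient of a low-risk level and lowering that of a high-risk level reduces the rate, which is the effect the risk coefficient determination module searches for.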
In one embodiment of the apparatus, the risk coefficient calculation module further includes: a risk constraint condition setting module configured to set the risk constraint conditions, wherein the risk constraint conditions comprise: when the risk coefficients are arranged by risk level, the risk coefficients decrease step by step; the difference between adjacent risk coefficients is greater than or equal to a first preset difference; the risk coefficients ranked first and second are greater than a first threshold, and the remaining risk coefficients are less than the first threshold; and the amount ratio of each risk level is calculated from the risk adjustment coefficient and the basic total amount data of that risk level, wherein the sum of the amount ratios of all risk levels equals a first preset percentage.
In one embodiment of the apparatus, the risk constraint condition setting module comprises: an amount ratio calculation module configured to calculate the amount ratio of each risk level from the risk adjustment coefficient and the basic total amount data of that risk level by using the following formula:
Figure BDA0003363879650000251
where X_i is the risk coefficient of the i-th risk level and a_i is the basic total amount data of the i-th risk level.
In one embodiment of the apparatus, the total amount bad rate calculation module is further configured to calculate the total amount bad rate by using the following formula:
Figure BDA0003363879650000252
where b_i is the bad amount data of the i-th risk level, X_i is the risk coefficient of the i-th risk level, and a_i is the basic total amount data of the i-th risk level.
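The two formulas are available only as images, so the following is a hedged reconstruction from the symbol definitions, not the original notation. Here r_i (the amount ratio of level i), P (the first preset percentage), and R (the total amount bad rate) are symbols introduced for this sketch. Under these assumed forms, the equality constraint fixes the denominator of R, which leaves an objective that is linear in the X_i and would explain why linear programming applies.

```latex
% Assumed forms; the original formulas are images and are not reproduced here.
\[
  r_i = \frac{X_i a_i}{\sum_{j} a_j},
  \qquad
  \sum_{i} r_i = P
  \;\Longrightarrow\;
  \sum_{i} a_i X_i = P \sum_{j} a_j
\]
\[
  R = \frac{\sum_{i} b_i X_i}{\sum_{i} a_i X_i}
    = \frac{1}{P \sum_{j} a_j} \sum_{i} b_i X_i
\]
```

With P and the a_j fixed, minimizing R is the same as minimizing the linear expression sum of b_i X_i subject to the linear ordering, gap, and threshold constraints.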
In one embodiment of the apparatus, the preset dimensions include: a unit dimension, a payroll level dimension, a unit area dimension, a generation month dimension, a credit history length dimension, and a historical repayment performance dimension.
For the detailed implementation of the loan amount calculation apparatus, reference may be made to the above embodiments of the loan amount calculation method, which are not repeated here. The modules in the loan amount calculation apparatus may be implemented wholly or partially in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 9. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing income amount data, liability amount data, risk coefficient, layering coefficient and other data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a loan amount calculation method.
Those skilled in the art will appreciate that the configuration shown in fig. 9 is a block diagram of only the portion of the configuration relevant to the disclosed aspects and does not limit the computer device to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above-described method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It should be noted that the information (including but not limited to financial information, credit information, etc.) and data (including but not limited to data for basic total amount, bad amount, etc.) of the subject related to the present disclosure are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database or other medium used in embodiments provided by the present disclosure may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present disclosure, and although their description is specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various changes and modifications without departing from the concept of the present disclosure, and these changes and modifications all fall within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the appended claims.

Claims (22)

1. A loan amount calculation method, characterized in that the method comprises:
calculating the difference value between income limit data and liability limit data to obtain basic limit data of the object, wherein the income limit data is determined according to the financial information of the object, and the liability limit data is determined according to credit information of the object;
calculating to obtain a risk coefficient by a pre-trained default probability model, a preset fractional span, a basic total amount calculated according to the basic amount data, bad amount data determined according to the object with overdue loan, a preset risk constraint condition and linear programming;
determining the object level of each dimension according to preset dimensions, and determining the layering coefficient of the object based on the object level of each dimension;
and calculating the loan amount of the object by using the basic amount data, the risk coefficient and the layering coefficient.
2. The method of claim 1, wherein the method of training the default probability model comprises:
analyzing and processing target variables determined through account age analysis to obtain characteristic variables, wherein the target variables comprise objects whose loans are overdue for longer than a first duration and objects whose loans are not overdue;
performing model fitting on the characteristic variables by using a logistic regression algorithm, and performing model evaluation on a logistic regression model obtained by model fitting;
and determining the evaluated logistic regression model as the default probability model when, in the model evaluation, the evaluation index is not lower than a first preset value and the stability index is not higher than a second preset value.
3. The method of claim 2, wherein the analyzing the target variable determined by the account age analysis to obtain the characteristic variable comprises:
obtaining information data of an object for establishing a model, determining a target variable of the information data through account age analysis, and obtaining modeling data in the target variable, wherein the modeling data comprises self data and third party data obtained after the object is authorized;
performing descriptive statistics on the modeling data;
performing data processing on the modeling data subjected to descriptive statistics to obtain characteristic variables, wherein the data processing comprises: deleting duplicate values, handling abnormal values, handling missing values, standardizing data, deriving features, binning variables, applying weight-of-evidence (WOE) transformation, and screening features according to the information values and correlation coefficients of the variables derived through the features.
4. The method of claim 3, wherein the feature screening based on the information value and the correlation coefficient of the variable derived from the feature comprises:
calculating an information value of the modeling data;
deleting the modeling data corresponding to the information value smaller than the first information threshold value or the information value larger than the second information threshold value;
calculating correlation coefficients of variables derived through features and the modeling data;
and retaining, among the modeling data whose correlation coefficient is larger than the correlation coefficient threshold, the modeling data with the largest information value.
5. The loan amount calculation method according to claim 1, wherein the risk coefficient is calculated by a pre-trained default probability model, a pre-set fractional span, a basic total amount calculated according to the basic amount data, bad amount data determined according to the object whose loan is overdue, a preset risk constraint condition and using linear programming, and comprises:
calculating the default probability of the object by utilizing the pre-trained default probability model;
determining a score for the subject based on the probability and a conversion coefficient calculated by a fractional span;
determining a risk level of the subject according to the score;
and calculating the basic total amount and the bad amount data in the risk grade, and performing linear programming calculation according to a preset risk constraint condition to obtain a risk coefficient corresponding to the grade.
6. The loan amount calculation method according to claim 2, wherein the calculating the basic total amount and the bad amount data in the risk level and performing linear programming calculation according to a preset risk constraint condition to obtain a risk coefficient corresponding to the level comprises:
calculating the sum of the basic amount data of the objects in each risk level to obtain the basic total amount data of each level;
calculating the sum of the basic amount data of the objects with overdue loans in each risk level to obtain the bad amount data of each level;
calculating the risk adjustment coefficient of each risk level according to the basic total amount data and the bad amount data of each risk level and a preset risk constraint condition;
calculating the total amount bad rate according to the basic total amount data, the bad amount data, and the risk adjustment coefficient of each risk level;
and determining the risk adjustment coefficient for which the total amount bad rate is minimal and the risk constraint condition is met as the risk coefficient corresponding to the risk level.
7. The method of claim 6, wherein the predefined risk constraints include:
when the risk coefficients are arranged by risk level, the risk coefficients decrease step by step;
the difference between adjacent risk coefficients is greater than or equal to a first preset difference;
the risk coefficients ranked first and second are greater than a first threshold, and the remaining risk coefficients are less than the first threshold;
and the amount ratio of each risk level is calculated from the risk adjustment coefficient and the basic total amount data of that risk level, wherein the sum of the amount ratios of all risk levels equals a first preset percentage.
8. The loan amount calculation method according to claim 7, wherein the amount ratio of each risk level is calculated by using the risk adjustment coefficient of each risk level and the basic total amount data by using the following formula:
Figure FDA0003363879640000031
and the total amount bad rate is calculated from the basic total amount data, the bad amount data, and the risk adjustment coefficient of each risk level by using the following formula:
Figure FDA0003363879640000032
where b_i is the bad amount data of the i-th risk level, X_i is the risk coefficient of the i-th risk level, and a_i is the basic total amount data of the i-th risk level.
9. The method of claim 1, wherein the preset dimensions include: a unit dimension, a payroll level dimension, a unit area dimension, a generation month dimension, a credit history length dimension, and a historical repayment performance dimension.
10. A loan amount calculation apparatus, comprising:
the income amount determining module is used for determining income amount data according to the financial information of the object;
the liability amount determining module is used for determining liability amount data according to the credit information of the object;
the basic quota calculation module is used for calculating the difference value of the income quota data and the liability quota data to obtain basic quota data of the object;
the risk coefficient calculation module is used for calculating a risk coefficient through a pre-trained default probability model, a preset conversion coefficient, a basic total amount obtained through calculation according to the basic amount data, bad amount data determined according to the object with overdue loan, a preset risk constraint condition and linear programming;
the hierarchical coefficient determining module is used for determining the object hierarchy of each dimension according to preset dimensions and determining the hierarchical coefficient of the object based on the object hierarchy of each dimension;
and the loan amount calculation module is used for calculating the loan amount of the object by utilizing the basic amount data, the risk coefficient and the layering coefficient.
11. The loan amount calculation apparatus according to claim 10, further comprising: the device comprises an analysis processing module, a model evaluation module and a model determination module;
the analysis processing module is used for analyzing and processing target variables determined through account age analysis to obtain characteristic variables, wherein the target variables comprise objects whose loans are overdue for longer than a first duration and objects whose loans are not overdue;
the model evaluation module is used for performing model fitting on the characteristic variables by using a logistic regression algorithm and performing model evaluation on a logistic regression model obtained by model fitting;
the model determination module is used for determining the evaluated logistic regression model as the default probability model when, in the model evaluation performed by the model evaluation module, the evaluation index is not lower than a first preset value and the stability index is not higher than a second preset value.
12. The loan amount calculation apparatus according to claim 11, wherein the analysis processing module includes: a target variable determining module, a descriptive statistics module, and a data processing module;
the target variable determining module is used for acquiring information data of an object for establishing a model, determining a target variable of the information data through account age analysis, and acquiring modeling data in the target variable, wherein the modeling data comprises self data and third party data acquired after the object is authorized;
the descriptive statistics module is used for performing descriptive statistics on the modeling data;
the data processing module is configured to perform data processing on the modeling data subjected to the descriptive statistics to obtain characteristic variables, wherein the data processing comprises: deleting duplicate values, handling abnormal values, handling missing values, standardizing data, deriving features, binning variables, applying weight-of-evidence (WOE) transformation, and screening features according to the information values and correlation coefficients of the variables derived through the features.
13. The loan amount calculation apparatus according to claim 12, wherein the data processing module includes: the device comprises an information value calculating module, an information value deleting module, a correlation coefficient calculating module and an obtaining module;
the information value calculating module is used for calculating the information value of the modeling data;
the information value deleting module is used for deleting the modeling data corresponding to the information value smaller than the first information threshold value or the information value larger than the second information threshold value;
the correlation coefficient calculation module is used for calculating the correlation coefficient of the variable derived from the characteristics and the modeling data;
the obtaining module is configured to retain, among the modeling data whose correlation coefficient is larger than the correlation coefficient threshold, the modeling data with the largest information value.
14. The loan amount calculation apparatus according to claim 11, wherein the risk coefficient calculation module includes: the system comprises a default probability calculation module, a score determination module, a risk grade matching module and a risk coefficient calculation module;
the default probability calculation module is used for calculating the probability of the object default by utilizing the pre-trained default probability model;
the score determining module is used for determining the score of the object based on the probability and a conversion coefficient obtained by fractional span calculation;
the risk level matching module is used for determining the risk level of the object according to the score;
and the risk coefficient calculation module is used for calculating the basic total amount and the bad amount data in the risk grade and performing linear programming calculation according to a preset risk constraint condition to obtain a risk coefficient corresponding to the grade.
15. The loan amount calculation apparatus according to claim 14, wherein the risk coefficient calculation module includes: a basic total amount data calculation module, a bad amount data calculation module, a risk adjustment coefficient calculation module, a total amount bad rate calculation module, and a risk coefficient determination module;
the basic total amount data calculation module is used for calculating the sum of the basic amount data of the objects in each risk level to obtain the basic total amount data of each level;
the bad amount data calculation module is used for calculating the sum of the basic amount data of the objects with overdue loans in each risk level to obtain the bad amount data of each level;
the risk adjustment coefficient calculation module is used for calculating the risk adjustment coefficient of each risk level according to the basic total amount data and the bad amount data of each risk level and a preset risk constraint condition;
the total amount bad rate calculation module is used for calculating the total amount bad rate from the basic total amount data, the bad amount data, and the risk adjustment coefficient of each risk level;
and the risk coefficient determination module is used for determining the risk adjustment coefficient for which the total amount bad rate is minimal and the risk constraint condition is met as the risk coefficient corresponding to the risk level.
16. The loan amount calculation apparatus according to claim 14 or 15, wherein the risk coefficient calculation module further comprises: a risk constraint condition setting module, configured to set risk constraint conditions, where the risk constraint conditions include: when the risk coefficients are arranged by risk level, the risk coefficients decrease step by step; the difference between adjacent risk coefficients is greater than or equal to a first preset difference; the risk coefficients ranked first and second are greater than a first threshold, and the remaining risk coefficients are less than the first threshold; and the amount ratio of each risk level is calculated from the risk adjustment coefficient and the basic total amount data of that risk level, wherein the sum of the amount ratios of all risk levels equals a first preset percentage.
17. The loan amount calculation apparatus according to claim 16, wherein the risk constraint condition setting module includes: an amount ratio calculation module used for calculating the amount ratio of each risk level from the risk adjustment coefficient and the basic total amount data of that risk level by using the following formula:
Figure FDA0003363879640000061
where X_i is the risk coefficient of the i-th risk level and a_i is the basic total amount data of the i-th risk level.
18. The loan amount calculation apparatus according to claim 16, wherein the total amount bad rate calculation module is further configured to calculate the total amount bad rate by using the following formula:
Figure FDA0003363879640000062
where b_i is the bad amount data of the i-th risk level, X_i is the risk coefficient of the i-th risk level, and a_i is the basic total amount data of the i-th risk level.
19. The loan amount calculation apparatus according to claim 16, wherein the preset dimensions in the layering coefficient determination module include: a unit dimension, a payroll level dimension, a unit area dimension, a generation month dimension, a credit history length dimension, and a historical repayment performance dimension.
20. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 9 when executing the computer program.
21. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
22. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 9 when executed by a processor.
Application number: CN202111375611.0A
Priority date: 2021-11-19
Filing date: 2021-11-19
Publication number: CN114219611A (en)
Publication date: 2022-03-22
Title: Loan amount calculation method and device, computer equipment and storage medium
Status: Pending
Family ID: 80697613
Country: CN

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115099942A (en) * | 2022-06-28 | 2022-09-23 | 中国银行股份有限公司 | Loan risk control method and device for bank customers
CN116954591A (en) * | 2023-06-15 | 2023-10-27 | 天云融创数据科技(北京)有限公司 | Generalized linear model training method, device, equipment and medium in banking field
CN119539943A (en) * | 2025-01-22 | 2025-02-28 | 杭银消费金融股份有限公司 | A loan evaluation method and system based on multi-objective optimization
CN119624634A (en) * | 2025-02-13 | 2025-03-14 | 北京银行股份有限公司 | Intelligent management method and system based on deep learning technology

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115099942A (en) * | 2022-06-28 | 2022-09-23 | 中国银行股份有限公司 | Loan risk control method and device for bank customers
CN116954591A (en) * | 2023-06-15 | 2023-10-27 | 天云融创数据科技(北京)有限公司 | Generalized linear model training method, device, equipment and medium in banking field
CN116954591B (en) * | 2023-06-15 | 2024-02-23 | 天云融创数据科技(北京)有限公司 | Generalized linear model training method, device, equipment and medium in banking field
CN119539943A (en) * | 2025-01-22 | 2025-02-28 | 杭银消费金融股份有限公司 | A loan evaluation method and system based on multi-objective optimization
CN119624634A (en) * | 2025-02-13 | 2025-03-14 | 北京银行股份有限公司 | Intelligent management method and system based on deep learning technology

Similar Documents

Publication | Publication Date | Title
CN110837931B (en) |  | Customer churn prediction method, device and storage medium
CN114219611A (en) |  | Loan amount calculation method and device, computer equipment and storage medium
CN112633962B (en) |  | Service recommendation method and device, computer equipment and storage medium
CN110909984B (en) |  | Business data processing model training method, business data processing method and device
Van Thiel et al. |  | Artificial intelligence credit risk prediction: An empirical study of analytical artificial intelligence tools for credit risk prediction in a digital era
CN111105266B (en) |  | Client grouping method and device based on improved decision tree
EP1361526A1 (en) |  | Electronic data processing system and method of using an electronic processing system for automatically determining a risk indicator value
CN114078050A (en) |  | Loan overdue prediction method and device, electronic equipment and computer readable medium
CN112200659A (en) |  | Method and device for establishing wind control model and storage medium
CN111709826A (en) |  | Target information determination method and device
CN116128635A (en) |  | Credit scoring card development method and terminal based on optimal variable box division algorithm
CN110728301A (en) |  | Credit scoring method, device, terminal and storage medium for individual user
CN115689713A (en) |  | Abnormal risk data processing method and device, computer equipment and storage medium
CN115099933A (en) |  | Service budget method, device and equipment
CN114626940A (en) |  | Data analysis method and device and electronic equipment
CN117952688A (en) |  | Classification method and device for merchants and electronic equipment
CN117575773A (en) |  | Method, device, computer equipment and storage medium for determining service data
CN117372155A (en) |  | Credit risk processing method, apparatus, computer device and storage medium
CN117273919A (en) |  | Internet credit assessment method for individual households and small micro-enterprise e-commerce merchants
CN117172910A (en) |  | Credit evaluation method and device based on EBM model, electronic equipment and storage medium
CN115660822A (en) |  | Wind control strategy processing method and device for financial business, electronic equipment and storage medium
CN118195247B (en) |  | Task allocation method, device, computer equipment and storage medium
Nazari et al. |  | Using the Hybrid Model for Credit Scoring (Case Study: Credit Clients of microloans, Bank Refah-Kargeran of Zanjan, Iran)
CN117994017A (en) |  | Method for constructing retail credit risk prediction model and online credit service Scoredelta model
CN118229402A (en) |  | Method for constructing retail credit risk prediction model and retail credit Scoresigmam a model

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
