Disclosure of Invention
One aspect of the present disclosure provides a user satisfaction prediction model determination method, including: obtaining sample data formed based on a historical service work order, wherein the sample data comprises a user satisfaction label and at least one candidate associated feature influencing the user satisfaction; constructing a linear regression cost function according to the user satisfaction label and the feature information value of each candidate associated feature, so as to determine a feature selection weight for each candidate associated feature; screening, from the at least one candidate associated feature, a feature set whose feature selection weights meet a preset condition, to obtain target associated features significantly related to the user satisfaction; and constructing a user satisfaction prediction model by taking the user satisfaction as a prediction target, taking the target associated features as independent variable features, and combining the feature selection weights for the target associated features.
Optionally, the constructing a linear regression cost function according to the user satisfaction label and the feature information value of each candidate associated feature to determine a feature selection weight for each candidate associated feature includes: constructing a linear fitting error function by taking the user satisfaction in the sample data as a prediction target and the at least one candidate associated feature as input features; calculating the similarity between the input features of different sample data; constructing a first regularization term to constrain the feature selection weights for the input features such that the feature selection weights are inversely proportional to the similarity; and constructing the linear regression cost function according to the linear fitting error function and the first regularization term.
Optionally, the constructing a linear fitting error function by taking the user satisfaction in the sample data as a prediction target and the at least one candidate associated feature as input features includes: constructing the linear fitting error function using the following equation:
f(w) = Σ_{i=1}^{m} (y_i − Σ_{k=1}^{n} w_{ik} · x_{ik})^2;
wherein f(w) is the linear fitting error function, m is the total number of samples, y_i is the user satisfaction value of the i-th sample, x_i is the candidate associated feature vector of the i-th sample, w_i = (w_{i1}, ..., w_{in}) is the feature selection weight vector of the i-th sample, and n is the total number of candidate associated features.
Optionally, the first regularization term is constructed using the following formula:
J_1 = Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1;
wherein J_1 is the first regularization term, i and j are sample indices, m is the total number of samples, r_{ij} is the similarity between the i-th and j-th samples, w_i is the feature selection weight vector of the i-th sample, w_j is the feature selection weight vector of the j-th sample, and ||·||_1 is the L1 norm.
Optionally, the constructing the linear regression cost function according to the linear fitting error function and the first regularization term includes: constructing the linear regression cost function using the following formula:
J = f(w) + λ_1 · J_1;
wherein λ_1 is a preset weighting coefficient.
Optionally, the constructing the linear regression cost function according to the linear fitting error function and the first regularization term includes: constructing a second regularization term to shrink the number of input features; and constructing the linear regression cost function by a cross-validation method according to the linear fitting error function, the first regularization term and the second regularization term.
Optionally, the second regularization term is constructed using the following formula:
J_2 = λ_2 · Σ_{i=1}^{m} ||w_i||_1;
wherein λ_2 is the penalty parameter, m is the total number of samples, ||·||_1 is the L1 norm, and w_i is the feature selection weight vector of the i-th sample.
Optionally, the constructing the linear regression cost function by a cross-validation method according to the linear fitting error function, the first regularization term and the second regularization term includes:
constructing the linear regression cost function using the following formula:
J = f(w) + λ_1 · Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1 + λ_2 · Σ_{i=1}^{m} ||w_i||_1;
and dividing the test data in the sample data into k parts, traversing a preset value range of the penalty parameter, using k−1 parts of the data to train the linear regression cost function and the remaining 1 part to evaluate the prediction effect of the linear regression cost function, and obtaining the penalty parameter λ_2 at which the prediction effect of the linear regression cost function is optimal.
Optionally, the determining a feature selection weight for each candidate associated feature includes: substituting the test data in the sample data into the linear regression cost function, and solving the feature selection weights in the linear regression cost function by an iterative least squares method.
Another aspect of the present disclosure provides a user satisfaction prediction method, including: acquiring interactive data formed based on a bank service work order; collecting feature information values of the target associated features in the interactive data; and inputting the feature information values into the user satisfaction prediction model obtained in the above embodiments to obtain a user satisfaction prediction value.
Another aspect of the present disclosure provides a user satisfaction prediction model determination apparatus, including: a first acquisition module configured to acquire sample data formed based on a historical service work order, wherein the sample data comprises a user satisfaction label and at least one candidate associated feature influencing the user satisfaction; a first processing module configured to construct a linear regression cost function according to the user satisfaction label and the feature information value of each candidate associated feature, so as to determine a feature selection weight for each candidate associated feature; a second processing module configured to screen, from the at least one candidate associated feature, a feature set whose feature selection weights meet a preset condition, to obtain target associated features significantly related to the user satisfaction; and a third processing module configured to construct a user satisfaction prediction model by taking the user satisfaction as a prediction target, taking the target associated features as independent variable features, and combining the feature selection weights for the target associated features.
Optionally, the first processing module includes: a first processing submodule configured to construct a linear fitting error function by taking the user satisfaction in the sample data as a prediction target and the at least one candidate associated feature as input features; a second processing submodule configured to calculate the similarity between the input features of different sample data; a third processing submodule configured to construct a first regularization term to constrain the feature selection weights for the input features such that the feature selection weights are inversely proportional to the similarity; and a fourth processing submodule configured to construct the linear regression cost function according to the linear fitting error function and the first regularization term.
Optionally, the first processing submodule is configured to construct the linear fitting error function using the following equation:
f(w) = Σ_{i=1}^{m} (y_i − Σ_{k=1}^{n} w_{ik} · x_{ik})^2;
wherein f(w) is the linear fitting error function, m is the total number of samples, y_i is the user satisfaction value of the i-th sample, x_i is the candidate associated feature vector of the i-th sample, w_i = (w_{i1}, ..., w_{in}) is the feature selection weight vector of the i-th sample, and n is the total number of candidate associated features.
Optionally, the third processing submodule is configured to construct the first regularization term using the following equation:
J_1 = Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1;
wherein J_1 is the first regularization term, i and j are sample indices, m is the total number of samples, r_{ij} is the similarity between the i-th and j-th samples, w_i is the feature selection weight vector of the i-th sample, w_j is the feature selection weight vector of the j-th sample, and ||·||_1 is the L1 norm.
Optionally, the fourth processing submodule includes: a first processing unit configured to construct the linear regression cost function using the following formula:
J = f(w) + λ_1 · J_1;
wherein λ_1 is a preset weighting coefficient.
Optionally, the fourth processing submodule includes: a second processing unit configured to construct a second regularization term to shrink the number of input features; and a third processing unit configured to construct the linear regression cost function by a cross-validation method according to the linear fitting error function, the first regularization term and the second regularization term.
Optionally, the second processing unit is configured to construct the second regularization term using the following equation:
J_2 = λ_2 · Σ_{i=1}^{m} ||w_i||_1;
wherein λ_2 is the penalty parameter, m is the total number of samples, ||·||_1 is the L1 norm, and w_i is the feature selection weight vector of the i-th sample.
Optionally, the third processing unit is configured to:
constructing the linear regression cost function using the following formula:
J = f(w) + λ_1 · Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1 + λ_2 · Σ_{i=1}^{m} ||w_i||_1;
and dividing the test data in the sample data into k parts, traversing a preset value range of the penalty parameter, using k−1 parts of the data to train the linear regression cost function and the remaining 1 part to evaluate the prediction effect of the linear regression cost function, and obtaining the penalty parameter λ_2 at which the prediction effect of the linear regression cost function is optimal.
Optionally, the first processing module further includes: a fifth processing submodule configured to substitute the test data in the sample data into the linear regression cost function and solve the feature selection weights in the linear regression cost function by an iterative least squares method.
Another aspect of the present disclosure provides a user satisfaction prediction apparatus, including: a second acquisition module configured to acquire interactive data formed based on a bank service work order; a fourth processing module configured to collect feature information values of the target associated features in the interactive data; and a fifth processing module configured to input the feature information values into the user satisfaction prediction model obtained in the above embodiments to obtain a user satisfaction prediction value.
Another aspect of the present disclosure provides an electronic device comprising one or more processors; memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods of embodiments of the present disclosure.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions that, when executed, implement the method of embodiments of the present disclosure.
Another aspect of the present disclosure provides a computer program product comprising computer readable instructions, wherein the computer readable instructions are configured to perform a user satisfaction prediction model determination method and a user satisfaction prediction method of the embodiments of the present disclosure when executed.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It is to be understood that such description is merely illustrative and not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, operations, and/or components, but do not preclude the presence or addition of one or more other features, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, such a construction is generally intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable user satisfaction prediction model determining apparatus such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon for use by or in connection with an instruction execution system.
Embodiments of the present disclosure provide a user satisfaction prediction model determination method and a processing apparatus to which the method can be applied. The method first obtains sample data formed based on a historical service work order, the sample data comprising a user satisfaction label and at least one candidate associated feature influencing the user satisfaction; then constructs a linear regression cost function according to the user satisfaction label and the feature information value of each candidate associated feature to determine a feature selection weight for each candidate associated feature; then screens, from the at least one candidate associated feature, a feature set whose feature selection weights meet a preset condition to obtain target associated features significantly related to the user satisfaction; and finally constructs a user satisfaction prediction model by taking the user satisfaction as a prediction target, taking the target associated features as independent variable features, and combining the feature selection weights for the target associated features.
Fig. 1 schematically shows a system architecture of a user satisfaction prediction model determination method and apparatus according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 may include database servers 101, 102, 103, a network 104, and a management server 105. The network 104 is the medium used to provide communication links between the database servers 101, 102, 103 and the management server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables, to name a few. The management server 105 may be an independent physical server, a server cluster or a distributed system including a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud computing, web services, and middleware services.
First, the management server 105 obtains, from a database server (e.g., database servers 101, 102, 103), sample data formed based on the historical service work order, the sample data including a user satisfaction tag and at least one candidate associated feature affecting user satisfaction. The management server 105 then constructs a linear regression cost function based on the user satisfaction tag and the feature information value of each candidate associated feature to determine a feature selection weight for each candidate associated feature, and screens, from the at least one candidate associated feature, a feature set whose feature selection weights meet a preset condition to obtain target associated features significantly related to the user satisfaction. Finally, the management server 105 constructs a user satisfaction prediction model by using the user satisfaction as a prediction target, using the target associated features as independent variable features, and combining the feature selection weights for the target associated features.
It should be noted that the user satisfaction prediction model determination method and apparatus according to the embodiments of the present disclosure may be used in the financial field, and may also be used in any field other than the financial field. The present disclosure will be described in detail below with reference to the drawings and specific embodiments.
Fig. 2 schematically illustrates a flowchart of a user satisfaction prediction model determination method according to an embodiment of the present disclosure, and as shown in fig. 2, the method 200 may include operations S210 to S240.
In operation S210, sample data formed based on the historical service work order is obtained, where the sample data includes a user satisfaction tag and at least one candidate associated feature that affects user satisfaction.
In this embodiment, specifically, sample data formed based on the historical service work order is acquired from the database server, where the sample data includes a user satisfaction tag indicating a satisfaction level of the sample user for the historical service, which may be embodied specifically by a user satisfaction score, where the user satisfaction score includes, for example, 5 satisfaction values of 1, 2, 3, 4, and 5. The user satisfaction score may be given by the sample user at the end of the history service, or may be obtained based on the questionnaire survey result, which is not limited in the present application.
The sample data also comprises at least one candidate associated characteristic influencing the satisfaction degree of the user, wherein the candidate associated characteristic comprises user attribute information of a historical service object, historical service information provided by the platform for the sample user and customer service attribute information providing historical service for the sample user. The user attribute information may include, for example, information such as user age, user gender, and region where the user is located, the historical service information may include, for example, information such as interactive voice response waiting duration, repeated incoming line amount, waiting hang-up times, single incoming line interaction duration, transaction service type, and user high-frequency vocabulary, and the service attribute information may include, for example, information such as service age, service gender, and service high-frequency vocabulary.
Based on a preset division ratio, the acquired sample data is divided into a training set and a test set, wherein the training set is used for training the parameters of the user satisfaction prediction model, and the test set is used for verifying the empirical error of the user satisfaction prediction model. Optionally, before the sample data is divided into data sets, the sample data is preprocessed, where the preprocessing includes, for example: filtering repeated data appearing in the sample data; screening abnormal data appearing in the sample data; interpolating missing data appearing in the sample data; and performing feature value processing on non-standard data appearing in the sample data to obtain the feature information values of the candidate associated features in the sample data. The feature value processing may include type conversion, normalization processing, digitization processing, vector conversion processing, and the like, where the type conversion may unify data from different sources in the same reference system, and the digitization processing may convert a non-numerical candidate associated feature into an equivalent numerical feature, for example, converting a text-form candidate associated feature into a corresponding feature vector through a natural language coding method.
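By way of illustration only, the preprocessing and data set division described above could be sketched as follows in Python; the column name "satisfaction", the 1–5 label range check, and the concrete cleaning choices are assumptions, not part of the disclosure.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

def preprocess_and_split(df: pd.DataFrame, label_col: str = "satisfaction",
                         test_ratio: float = 0.2):
    """Clean work-order sample data and divide it into training and test sets."""
    df = df.drop_duplicates().copy()                 # filter repeated data
    df = df[df[label_col].between(1, 5)].copy()      # screen abnormal label values (assumed 1-5 scale)
    num_cols = df.select_dtypes(include="number").columns
    df[num_cols] = df[num_cols].interpolate()        # interpolate missing numeric data
    df = df.dropna()                                 # drop rows that remain incomplete
    # digitization: convert non-numerical candidate features into numerical ones
    obj_cols = [c for c in df.columns if df[c].dtype == "object"]
    df = pd.get_dummies(df, columns=obj_cols)
    y = df[label_col].to_numpy()
    X = MinMaxScaler().fit_transform(df.drop(columns=[label_col]))   # normalization
    return train_test_split(X, y, test_size=test_ratio, random_state=0)
```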
Next, in operation S220, a linear regression cost function is constructed according to the user satisfaction label and the feature information value of each candidate associated feature to determine a feature selection weight for each candidate associated feature.
In this embodiment, specifically, a linear fitting error function is constructed by taking the user satisfaction in the sample data as a prediction target and at least one candidate associated feature as an input feature.
Specifically, the linear fitting error function is constructed using the following equation:
f(w) = Σ_{i=1}^{m} (y_i − Σ_{k=1}^{n} w_{ik} · x_{ik})^2;
wherein f(w) is the linear fitting error function, m is the total number of samples, y_i is the user satisfaction value of the i-th sample, x_i is the candidate associated feature vector of the i-th sample, w_i = (w_{i1}, ..., w_{in}) is the feature selection weight vector of the i-th sample, and n is the total number of candidate associated features.
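As a minimal numerical sketch of the fitting error reconstructed above, the per-sample weight vectors can be stacked into an m × n array; this array layout is an assumption made for illustration.

```python
import numpy as np

def fit_error(W: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """f(w) = sum_i (y_i - w_i^T x_i)^2 with one weight vector per sample.

    W, X: (m, n) arrays of per-sample weight vectors and feature vectors.
    y: (m,) array of user satisfaction values.
    """
    residuals = y - np.einsum("ij,ij->i", W, X)   # w_i^T x_i for every sample
    return float(np.sum(residuals ** 2))
```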
After the linear fitting error function is obtained, in order to prevent the user satisfaction prediction model constructed on the basis of the linear fitting error function from overfitting, regularization terms are further constructed on the basis of the linear fitting error function; by adding these penalty terms, the feature dimension of the sample data is reduced, the number of model parameters is compressed, and the complexity of the model is limited.
Specifically, on the basis of the linear fitting error function, a first regularization term is constructed to constrain the feature selection weights for the input features such that the feature selection weights are inversely proportional to the similarity between sample data, and a second regularization term is constructed to shrink the number of input features, thereby obtaining the linear regression cost function. The test data in the sample data is then substituted into the linear regression cost function, and the feature selection weights in the linear regression cost function are solved by an iterative least squares method to obtain the feature selection weight associated with each candidate associated feature.
Next, in operation S230, a feature set whose feature selection weights meet a preset condition is screened from the at least one candidate associated feature, so as to obtain the target associated features significantly related to user satisfaction.
In this embodiment, specifically, in at least one candidate associated feature of the sample data, according to the feature selection weight associated with each candidate associated feature, a feature set with the feature selection weight higher than a preset threshold is screened, so as to obtain a target associated feature significantly related to the user satisfaction. Illustratively, the feature selection weight indicates a correlation coefficient between the candidate associated features and the user satisfaction, and among 10 candidate associated features associated with the user satisfaction, a feature set with the correlation coefficient greater than 0.6 is screened to obtain 6 target associated features significantly related to the user satisfaction, so that the feature space dimension of the sample data is reduced from 10 to 6.
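A sketch of this screening step, assuming the per-sample feature selection weight vectors are aggregated into one weight per candidate feature (here by averaging their absolute values) and using the 0.6 threshold from the example above; the aggregation rule is an assumption.

```python
import numpy as np

def screen_features(W: np.ndarray, feature_names: list, threshold: float = 0.6):
    """Keep candidate features whose aggregated selection weight exceeds the threshold."""
    weights = np.abs(W).mean(axis=0)        # one aggregated weight per candidate feature
    keep = weights > threshold
    kept_names = [name for name, k in zip(feature_names, keep) if k]
    return kept_names, keep
```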
Next, in operation S240, a user satisfaction prediction model is constructed using the user satisfaction as a prediction target and the target-associated features as argument features in combination with the feature selection weights for the respective target-associated features.
In this embodiment, specifically, the user satisfaction is used as the prediction target, the determined at least one target associated feature is used as the independent variable feature, and the user satisfaction prediction model is constructed by combining the feature selection weights for the target associated features. Specifically, a linear equation y = w^T · x is constructed based on the prediction target and the independent variable features to obtain the user satisfaction prediction model, where y is the prediction target, x = [x_k, k = 1, 2, ..., l] is the target associated feature vector, w^T = [w_k, k = 1, 2, ..., l]^T, l is the total number of target associated features, and w_k is the feature selection weight for the k-th target associated feature.
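A sketch of the resulting linear model y = w^T · x; the class name and interface are illustrative only.

```python
import numpy as np

class SatisfactionModel:
    """Linear user satisfaction model y = w^T x over the target associated features."""

    def __init__(self, weights: np.ndarray, feature_names: list):
        self.w = weights                    # w_k, k = 1..l
        self.feature_names = feature_names  # names of the l target associated features

    def predict(self, x: np.ndarray) -> float:
        return float(self.w @ x)            # y = w^T x
```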
After the user satisfaction prediction model is obtained through training, interactive data formed based on a bank service work order is acquired, and the feature information values of the target associated features in the interactive data are collected. The feature information values are then input into the user satisfaction prediction model to obtain a user satisfaction prediction value, so as to achieve a deeper understanding of user requirements, provide guidance for agent onboarding training and agent-to-customer service in telephone banking, and help deliver telephone banking customer service of higher quality that better meets user requirements.
Through the embodiments of the present disclosure, sample data formed based on a historical service work order is obtained, wherein the sample data comprises a user satisfaction label and at least one candidate associated feature influencing the user satisfaction; a linear regression cost function is constructed according to the user satisfaction label and the feature information value of each candidate associated feature, so as to determine a feature selection weight for each candidate associated feature; a feature set whose feature selection weights meet a preset condition is screened from the at least one candidate associated feature to obtain target associated features significantly related to the user satisfaction; and a user satisfaction prediction model is constructed by taking the user satisfaction as a prediction target, taking the target associated features as independent variable features, and combining the feature selection weights for the target associated features. The feature selection weight for each candidate associated feature is determined by constructing the linear regression cost function, and the target associated features meeting the preset condition are screened from the at least one candidate associated feature based on the feature selection weights. By reducing the feature space dimension of the sample data and compressing the number of parameters of the prediction model, the overfitting problem of the prediction model is effectively prevented, which improves the efficiency and effect of user satisfaction prediction and helps provide more instructive reference information for agent onboarding training and agent-to-customer service in telephone banking.
Fig. 3 schematically shows a flowchart of still another user satisfaction prediction model determination method according to an embodiment of the present disclosure, and as shown in fig. 3, operation S220 may include operations S310 to S350.
In operation S310, a linear fitting error function is constructed using the user satisfaction in the sample data as a prediction target and at least one candidate associated feature as an input feature.
In the present embodiment, specifically, the linear fitting error function is constructed using the following equation:
f(w) = Σ_{i=1}^{m} (y_i − Σ_{k=1}^{n} w_{ik} · x_{ik})^2;
wherein f(w) is the linear fitting error function, m is the total number of samples, y_i is the user satisfaction value of the i-th sample, x_i is the candidate associated feature vector of the i-th sample, w_i = (w_{i1}, ..., w_{in}) is the feature selection weight vector of the i-th sample, and n is the total number of candidate associated features.
Next, in operation S320, a similarity between input features of different sample data is calculated.
In this embodiment, specifically, when selecting candidate associated features, it is necessary to evaluate the individual differences between different sample data so as to screen out sample data with significant expression differences. The at least one candidate associated feature is taken as the input features, and the similarity between the input features of different sample data is calculated and used as the similarity between the different sample data. Alternatively, the distance between the input features of different sample data is calculated, for example, the Minkowski distance, Euclidean distance, Manhattan distance, Mahalanobis distance, etc. between different input features; the cosine of the included angle between different input features, the vector inner product, etc. may also be calculated to obtain the similarity between the input features of different sample data.
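A sketch of the similarity computation, showing the cosine of the included angle and a Euclidean-distance-based alternative; the mapping from distance to similarity is an assumption.

```python
import numpy as np

def pairwise_similarity(X: np.ndarray, method: str = "cosine") -> np.ndarray:
    """r_ij similarity matrix between the input feature vectors of all samples."""
    if method == "cosine":
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        return (X @ X.T) / (norms * norms.T + 1e-12)        # cosine of the included angle
    if method == "euclidean":
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        return 1.0 / (1.0 + d)   # one possible distance-to-similarity mapping (assumption)
    raise ValueError(method)
```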
Optionally, after the similarity between the input features of different sample data is calculated, candidate associated features whose similarity is lower than a preset threshold, i.e., features with significant expression differences, are screened. Within the candidate associated feature set with significant expression differences, a Cox univariate regression analysis of user satisfaction is performed on the candidate associated features, and the candidate associated features highly correlated with the user satisfaction are screened by setting a statistical threshold. The subsequent first regularization term and second regularization term are then constructed based on the screened candidate associated features with higher correlation.
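As an illustration of the univariate screening idea, the following sketch uses a simple Pearson correlation test as a stand-in for the Cox univariate regression analysis named above; the stand-in test and the 0.05 significance threshold are assumptions.

```python
import numpy as np
from scipy import stats

def univariate_screen(X: np.ndarray, y: np.ndarray, feature_names: list,
                      p_max: float = 0.05) -> list:
    """Keep candidate features whose univariate association with satisfaction is significant."""
    kept = []
    for k, name in enumerate(feature_names):
        r, p = stats.pearsonr(X[:, k], y)   # univariate test, stand-in for the Cox analysis
        if p < p_max:
            kept.append(name)
    return kept
```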
Next, in operation S330, a first regularization term is constructed to define a feature selection weight for the input features such that the feature selection weight is inversely proportional to the similarity.
In the present embodiment, specifically, the first regularization term is constructed using the following formula:
J_1 = Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1;
wherein J_1 is the first regularization term, i and j are sample indices, m is the total number of samples, r_{ij} is the similarity between the i-th and j-th samples, w_i is the feature selection weight vector of the i-th sample, w_j is the feature selection weight vector of the j-th sample, and ||·||_1 is the L1 norm, i.e., ||x||_1 = Σ_i |x_i|, where the x_i are the elements of x.
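A sketch of the first regularization term J_1, assuming the pair set (i, j) ∈ m covers all sample pairs and that the similarity matrix R holds the r_ij values.

```python
import numpy as np

def first_regularizer(W: np.ndarray, R: np.ndarray) -> float:
    """J_1 = sum over sample pairs of r_ij * ||w_i - w_j||_1 (pairs counted symmetrically)."""
    diff_l1 = np.abs(W[:, None, :] - W[None, :, :]).sum(axis=-1)   # ||w_i - w_j||_1
    return float((R * diff_l1).sum())
```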
Next, in operation S340, a second regularization term is constructed to narrow the number of input features.
In the present embodiment, specifically, the second regularization term is constructed using the following formula:
J_2 = λ_2 · Σ_{i=1}^{m} ||w_i||_1;
wherein λ_2 is the penalty parameter, m is the total number of samples, ||·||_1 is the L1 norm, and w_i is the feature selection weight vector of the i-th sample.
Next, in operation S350, a linear regression cost function is constructed by using a cross-validation method according to the linear fitting error function, the first regularization term and the second regularization term.
In the present embodiment, specifically, the linear regression cost function is constructed using the following formula:
J = f(w) + λ_1 · Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1 + λ_2 · Σ_{i=1}^{m} ||w_i||_1;
wherein J is the LASSO (Least Absolute Shrinkage and Selection Operator) regression cost function, i.e., the linear regression cost function; f(w) is the linear fitting error function, m is the total number of samples, y_i is the user satisfaction value of the i-th sample, x_i is the candidate associated feature vector of the i-th sample, w_i is the feature selection weight vector of the i-th sample, and n is the total number of candidate associated features. J_1 is the first regularization term, i and j are sample indices, r_{ij} is the similarity between the i-th and j-th samples, w_j is the feature selection weight vector of the j-th sample, ||·||_1 is the L1 norm, λ_1 is a preset weighting coefficient, J_2 is the second regularization term, and λ_2 is the penalty parameter.
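Putting the pieces together, a sketch of the complete cost J = f(w) + λ_1 · J_1 + J_2, reusing the fit_error and first_regularizer helpers sketched earlier; the function signature is illustrative.

```python
import numpy as np

def second_regularizer(W: np.ndarray, lam2: float) -> float:
    """J_2 = lambda_2 * sum_i ||w_i||_1."""
    return float(lam2 * np.abs(W).sum())

def cost(W: np.ndarray, X: np.ndarray, y: np.ndarray, R: np.ndarray,
         lam1: float, lam2: float) -> float:
    """J = f(w) + lambda_1 * J_1 + J_2 (LASSO-style cost with the locality term)."""
    return fit_error(W, X, y) + lam1 * first_regularizer(W, R) + second_regularizer(W, lam2)
```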
In calculating the feature selection weight for each target associated feature, it is necessary to make the linear regression cost function J as small as possible, that is, to make f(w) as small as possible, so that the values of λ_1 · Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1 and λ_2 · Σ_{i=1}^{m} ||w_i||_1 also need to be made as small as possible. Therefore, when the similarity between any two sample data is high, the corresponding feature selection weight vectors should be closer to each other, so as to preserve the locality of the sample data.
As for λ_2, the test data in the sample data is divided into k parts, and a preset value range of the penalty parameter is traversed (for example, the preset value range is [0, 0.1, 0.2, ..., 0.9, 1]). k−1 parts of the data are used to train the linear regression cost function, and the remaining 1 part is used to evaluate the prediction effect of the linear regression cost function, so as to obtain the penalty parameter λ_2 at which the prediction effect of the linear regression cost function is optimal. λ_2 is the penalty coefficient that pre-constrains the second regularization term and measures the strictness of the input feature selection: the larger λ_2 is, the stricter the input feature selection, and the fewer features participate in the prediction model training.
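A sketch of the k-fold search over the λ_2 grid described above, reusing the cost function sketched earlier; a generic numerical minimizer is used here as a stand-in for the iterative least squares solver, and the mean squared error used for evaluation is an assumption.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.model_selection import KFold

def select_lambda2(X, y, R, lam1=0.1, grid=np.arange(0.0, 1.01, 0.1), k=5):
    """Traverse the lambda_2 grid and keep the value with the best held-out error."""
    best_lam2, best_err = None, np.inf
    for lam2 in grid:
        errs = []
        for train_idx, val_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
            Xt, yt = X[train_idx], y[train_idx]
            Rt = R[np.ix_(train_idx, train_idx)]
            m, n = Xt.shape
            # generic minimizer used as a stand-in for the iterative least squares solver
            res = minimize(lambda w: cost(w.reshape(m, n), Xt, yt, Rt, lam1, lam2),
                           x0=np.zeros(m * n), options={"maxiter": 50})
            W = res.x.reshape(m, n)
            w_bar = W.mean(axis=0)                 # shared weight vector for evaluation
            preds = X[val_idx] @ w_bar
            errs.append(np.mean((preds - y[val_idx]) ** 2))
        if np.mean(errs) < best_err:
            best_lam2, best_err = lam2, np.mean(errs)
    return best_lam2
```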
By adding the L1 constraint, the linear regression cost function addresses the multicollinearity and overfitting problems, and achieves selection and sparsification of the input features during prediction model training by compressing the selection weights of some candidate associated features to 0. This not only helps reduce the complexity of prediction model training and improve the generalization capability of the prediction model, but also helps improve the interpretability and efficiency of user satisfaction prediction.
Fig. 4 schematically shows a flowchart of a user satisfaction prediction method according to an embodiment of the present disclosure, and as shown in fig. 4, the method 400 may include operations S410 to S430.
In operation S410, interactive data formed based on a banking work order is acquired.
Next, in operation S420, feature information values of the target associated features in the interaction data are collected.
Next, in operation S430, the characteristic information value is input into the user satisfaction prediction model, and a user satisfaction prediction value is obtained.
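An illustrative end-to-end use of operations S410 to S430, reusing the SatisfactionModel sketch from above; the feature names, weights, and input values are hypothetical.

```python
import numpy as np

# Feature information values of the target associated features collected from a
# new banking work order (illustrative numbers, in the order used during training).
x_new = np.array([4.0, 1.0, 120.0])

# Feature selection weights previously determined for those target associated features.
model = SatisfactionModel(weights=np.array([0.5, -0.2, 0.01]),
                          feature_names=["ivr_wait_min", "repeat_calls", "interaction_s"])

y_hat = model.predict(x_new)   # predicted user satisfaction value, here 3.0
```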
Fig. 5 schematically shows a block diagram of a user satisfaction prediction model determination apparatus according to an embodiment of the present disclosure.
As shown in fig. 5, the apparatus 500 includes a first obtaining module 501, a first processing module 502, a second processing module 503, and a third processing module 504.
Specifically, the first obtaining module 501 is configured to obtain sample data formed based on a historical service work order, where the sample data includes a user satisfaction tag and at least one candidate associated feature that affects user satisfaction; the first processing module 502 is configured to construct a linear regression cost function according to the user satisfaction tag and the feature information value of each candidate associated feature, so as to determine a feature selection weight for each candidate associated feature; the second processing module 503 is configured to screen, from the at least one candidate associated feature, a feature set whose feature selection weights meet a preset condition to obtain target associated features significantly related to the user satisfaction; and the third processing module 504 is configured to construct a user satisfaction prediction model by using the user satisfaction as a prediction target, using the target associated features as independent variable features, and combining the feature selection weights for the target associated features.
Through the embodiments of the present disclosure, sample data formed based on a historical service work order is obtained, wherein the sample data comprises a user satisfaction label and at least one candidate associated feature influencing the user satisfaction; a linear regression cost function is constructed according to the user satisfaction label and the feature information value of each candidate associated feature, so as to determine a feature selection weight for each candidate associated feature; a feature set whose feature selection weights meet a preset condition is screened from the at least one candidate associated feature to obtain target associated features significantly related to the user satisfaction; and a user satisfaction prediction model is constructed by taking the user satisfaction as a prediction target, taking the target associated features as independent variable features, and combining the feature selection weights for the target associated features. The feature selection weight for each candidate associated feature is determined by constructing the linear regression cost function, and the target associated features meeting the preset condition are screened from the at least one candidate associated feature based on the feature selection weights. By reducing the feature space dimension of the sample data and compressing the number of parameters of the prediction model, the overfitting problem of the prediction model is effectively prevented, which improves the efficiency and effect of user satisfaction prediction and helps provide more instructive reference information for agent onboarding training and agent-to-customer service in telephone banking.
As a possible embodiment, the first processing module includes: a first processing submodule configured to construct a linear fitting error function by taking the user satisfaction in the sample data as a prediction target and the at least one candidate associated feature as input features; a second processing submodule configured to calculate the similarity between the input features of different sample data; a third processing submodule configured to construct a first regularization term to constrain the feature selection weights for the input features such that the feature selection weights are inversely proportional to the similarity; and a fourth processing submodule configured to construct the linear regression cost function according to the linear fitting error function and the first regularization term.
As a possible embodiment, the first processing submodule is configured to construct the linear fitting error function using the following equation:
f(w) = Σ_{i=1}^{m} (y_i − Σ_{k=1}^{n} w_{ik} · x_{ik})^2;
wherein f(w) is the linear fitting error function, m is the total number of samples, y_i is the user satisfaction value of the i-th sample, x_i is the candidate associated feature vector of the i-th sample, w_i = (w_{i1}, ..., w_{in}) is the feature selection weight vector of the i-th sample, and n is the total number of candidate associated features.
As a possible embodiment, the third processing submodule is configured to: the first regularization term is constructed using the following equation:
J1=∑(i、j)∈mrij||wi-wj||1;
wherein, J1Is a first regular term, i and j are sample numbers, m is the total number of samples, rijIs the similarity between the ith and jth samples, wiSelecting weights for features of ith sampleVector, wjSelecting a weight vector for the features of the jth sample, | |. luminance1Is L1And (4) norm.
As a possible embodiment, the fourth processing submodule includes: a first processing unit configured to construct the linear regression cost function using the following formula:
J = f(w) + λ_1 · J_1;
wherein λ_1 is a preset weighting coefficient.
As a possible embodiment, the fourth processing submodule includes: a second processing unit configured to construct a second regularization term to shrink the number of input features; and a third processing unit configured to construct the linear regression cost function by a cross-validation method according to the linear fitting error function, the first regularization term and the second regularization term.
As a possible embodiment, the second processing unit is configured to construct the second regularization term using the following equation:
J_2 = λ_2 · Σ_{i=1}^{m} ||w_i||_1;
wherein λ_2 is the penalty parameter, m is the total number of samples, ||·||_1 is the L1 norm, and w_i is the feature selection weight vector of the i-th sample.
As a possible embodiment, the third processing unit is configured to:
constructing the linear regression cost function using the following formula:
J = f(w) + λ_1 · Σ_{(i,j)∈m} r_{ij} · ||w_i − w_j||_1 + λ_2 · Σ_{i=1}^{m} ||w_i||_1;
and dividing the test data in the sample data into k parts, traversing a preset value range of the penalty parameter, using k−1 parts of the data to train the linear regression cost function and the remaining 1 part to evaluate the prediction effect of the linear regression cost function, and obtaining the penalty parameter λ_2 at which the prediction effect of the linear regression cost function is optimal.
As a possible embodiment, the first processing module further includes: a fifth processing submodule configured to substitute the test data in the sample data into the linear regression cost function and solve the feature selection weights in the linear regression cost function by an iterative least squares method.
Fig. 6 schematically shows a block diagram of a user satisfaction prediction apparatus according to an embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 includes a second obtaining module 601, a fourth processing module 602, and a fifth processing module 603.
Specifically, the second obtaining module 601 is configured to obtain interactive data formed based on a bank service work order; the fourth processing module 602 is configured to collect feature information values of the target associated features in the interactive data; and the fifth processing module 603 is configured to input the feature information values into the user satisfaction prediction model obtained in the foregoing embodiments to obtain a user satisfaction prediction value.
It should be noted that, in the embodiments of the present disclosure, the implementation of the apparatus portion is the same as or similar to the implementation of the method portion, and is not described herein again.
Any of the modules according to the embodiments of the present disclosure, or at least part of the functionality of any of them, may be implemented in one module. Any one or more of the modules according to the embodiments of the present disclosure may be split into a plurality of modules for implementation. Any one or more of the modules according to the embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system in a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of integrating or packaging a circuit in hardware or firmware, or in any one of, or a suitable combination of, the three implementations of software, hardware, and firmware. Alternatively, one or more of the modules according to the embodiments of the present disclosure may be implemented at least partly as computer program modules which, when executed, may perform the corresponding functions.
For example, any number of the first obtaining module 501, the first processing module 502, the second processing module 503 and the third processing module 504, or the second obtaining module 601, the fourth processing module 602 and the fifth processing module 603, may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first obtaining module 501, the first processing module 502, the second processing module 503 and the third processing module 504, or the second obtaining module 601, the fourth processing module 602 and the fifth processing module 603, may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system in a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging circuits, or implemented by any one of, or a suitable combination of, software, hardware and firmware implementations. At least one of the first obtaining module 501, the first processing module 502, the second processing module 503 and the third processing module 504, or the second obtaining module 601, the fourth processing module 602 and the fifth processing module 603, may be at least partially implemented as a computer program module which, when executed, may perform the corresponding function.
Fig. 7 schematically shows a block diagram of an electronic device 700 suitable for implementing the processing methods and processing apparatuses according to the embodiments of the disclosure. The electronic device 700 shown in fig. 7 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to the embodiments of the present disclosure.
In the RAM 703, various programs and data necessary for the operation of the electronic device 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. The processor 701 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. Note that the programs may also be stored in one or more memories other than the ROM 702 and the RAM 703. The processor 701 may also perform various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 700 may also include an input/output (I/O) interface 705, which is also connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
According to the embodiments of the present disclosure, the method flows according to the embodiments of the present disclosure may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable storage medium, the computer program containing program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the processor 701, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to the embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to the embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to the embodiments of the present disclosure, a computer-readable storage medium may include the ROM 702 and/or the RAM 703 and/or one or more memories other than the ROM 702 and the RAM 703 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method provided by the embodiments of the present disclosure; when the computer program product runs on an electronic device, the program code is configured to enable the electronic device to implement the user satisfaction prediction model determination method and the user satisfaction prediction method provided by the embodiments of the present disclosure.
The computer program, when executed by the processor 701, performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to the embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted in the form of a signal on a network medium, distributed, downloaded and installed via the communication section 709, and/or installed from the removable medium 711. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with the embodiments of the present disclosure, program code for executing the computer programs provided by the embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, these computer programs may be implemented using high level procedural and/or object oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, languages such as Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.