Disclosure of Invention
An object of the present disclosure is to provide a data processing method, a data processing apparatus, a computer-readable storage medium, and an electronic device, which overcome, to some extent, the problem of low accuracy and efficiency in providing product discounts due to limitations and disadvantages of the related art.
According to a first aspect of the present disclosure, there is provided a data processing method comprising:
acquiring multi-dimensional feature data of a target client, and generating a feature vector of the target client according to the multi-dimensional feature data, wherein the multi-dimensional feature data comprises attribute feature data, behavior feature data and order feature data;
inputting the feature vector into a preset regression model to obtain first discount data;
calculating a price elasticity coefficient corresponding to the target customer through a price elasticity model;
and adjusting the first discount data according to the price elasticity coefficient to obtain second discount data.
In an exemplary embodiment of the present disclosure, the generating a feature vector of the target client according to the multi-dimensional feature data includes:
vectorizing the multi-dimensional feature data to obtain a plurality of feature vectors;
and splicing the plurality of feature vectors to generate the feature vector of the target client.
In an exemplary embodiment of the present disclosure, the multi-dimensional feature data includes at least one of discrete feature data and continuous feature data; the vectorizing processing of the multi-dimensional feature data to obtain a plurality of feature vectors includes:
counting the discrete characteristic data to obtain a plurality of first characteristic data;
encoding each first feature data to generate a plurality of first feature vectors;
and normalizing each continuous feature data to generate a plurality of second feature vectors.
In an exemplary embodiment of the present disclosure, the splicing the plurality of feature vectors to generate the feature vector of the target customer includes:
and splicing the plurality of first feature vectors and the plurality of second feature vectors to generate the feature vector of the target customer.
In an exemplary embodiment of the disclosure, the calculating, by a price elasticity model, a price elasticity coefficient corresponding to the target customer includes:
acquiring sample data, wherein the sample data is historical multidimensional characteristic data in a preset time period, and the historical multidimensional characteristic data comprises attribute characteristic data, behavior characteristic data, order characteristic data and historical discount data of a historical client;
fitting the historical multidimensional characteristic data after the characteristic data of the target order are removed to obtain a first expected value of the historical discount data;
fitting the historical multidimensional characteristic data after removing the historical discount data to obtain a second expected value of the characteristic data of the target order;
calculating to obtain a price elasticity coefficient corresponding to the historical customer according to the target order characteristic data, the historical discount data, the first expected value and the second expected value;
and determining the price elasticity coefficient corresponding to the target customer according to the price elasticity coefficient corresponding to the historical customer.
In an exemplary embodiment of the present disclosure, before the feature vector is input into a preset regression model to obtain the first discount data, the method further includes:
acquiring sample data, wherein the sample data is historical multidimensional characteristic data in a preset time period;
iteratively training the parameters of the preset regression model and the parameters of the price elasticity model, respectively, by using the sample data;
and when the iteration termination condition is met, finishing the training of the parameters of the preset regression model and the parameters of the price elasticity model.
In an exemplary embodiment of the disclosure, the adjusting the first discount data according to the price elasticity coefficient to obtain second discount data includes:
determining the sensitivity level of the target customer according to the price elasticity coefficient;
and correspondingly adjusting the first discount data according to the sensitivity level to obtain the second discount data.
In an exemplary embodiment of the present disclosure, the determining the sensitivity level of the target customer according to the price elasticity coefficient includes:
when the price elasticity coefficient is larger than a first elasticity coefficient threshold value, determining that the target customer is a high-sensitivity customer;
when the price elasticity coefficient is larger than or equal to a second elasticity coefficient threshold value and is smaller than or equal to the first elasticity coefficient threshold value, determining that the target customer is a sensitive customer;
when the price elasticity coefficient is smaller than the second elasticity coefficient threshold value, determining that the target customer is a low-sensitivity customer.
In an exemplary embodiment of the present disclosure, the correspondingly adjusting the first discount data according to the sensitivity level to obtain the second discount data includes:
calculating target discount data increment according to the first discount data, the price elasticity coefficient and the target order characteristic data;
when the target client is a high-sensitivity client, subtracting the target discount data increment from the first discount data to obtain the second discount data;
when the target client is a low-sensitivity client, adding the target discount data increment to the first discount data to obtain the second discount data;
and when the target customer is a sensitive customer, taking the first discount data as the second discount data.
In an exemplary embodiment of the present disclosure, the preset regression model is a random forest model.
According to a second aspect of the present disclosure, there is provided a data processing apparatus comprising:
the multi-dimensional characteristic data acquisition module is used for acquiring multi-dimensional characteristic data of a target client and generating a characteristic vector of the target client according to the multi-dimensional characteristic data, wherein the multi-dimensional characteristic data comprises attribute characteristic data, behavior characteristic data and order characteristic data;
the first discount data determination module is used for inputting the characteristic vector into a preset regression model to obtain first discount data;
the price elasticity coefficient determining module is used for calculating the price elasticity coefficient corresponding to the target client through a price elasticity model;
and the second discount data determining module is used for adjusting the first discount data according to the price elasticity coefficient to obtain second discount data.
According to a third aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of the above.
According to a fourth aspect of the present disclosure, there is provided an electronic apparatus comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the method of any one of the above via execution of the executable instructions.
Exemplary embodiments of the present disclosure may have some or all of the following benefits:
in the data processing method provided by the exemplary embodiment of the present disclosure, multi-dimensional feature data of a target client is obtained, and a feature vector of the target client is generated according to the multi-dimensional feature data; the feature vector is input into a preset regression model, and first discount data is obtained by fitting; a price elasticity coefficient corresponding to the target customer is calculated through a price elasticity model based on the first discount data and the order feature data; and the first discount data is adjusted according to the price elasticity coefficient to obtain second discount data. In this method, the price elasticity coefficient is used to represent the customer sensitivity level, a predicted discount is obtained by regression model fitting, and the predicted discount is then adjusted according to the customer sensitivity level, so that a final discount suited to each customer can be given accurately and efficiently.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1 is a schematic diagram illustrating an exemplary system architecture to which a data processing method and apparatus according to an embodiment of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables. The terminal devices 101, 102, 103 may be various electronic devices having a display screen, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, the server 105 may be a server cluster composed of multiple servers.
The data processing method provided by the embodiment of the disclosure may be executed by the server 105, and accordingly, the data processing apparatus may be disposed in the server 105. The server may send the first discount data, the price elasticity coefficient corresponding to the target client, the second discount data, and the like to the terminal device, and the terminal device may display the data to the staff at the logistics sales side. However, it is easily understood by those skilled in the art that the data processing method provided in the embodiment of the present disclosure may also be executed by the terminal devices 101, 102, and 103, and correspondingly, the data processing apparatus may also be disposed in the terminal devices 101, 102, and 103. For example, after the method is executed by a terminal device, the first discount data, the price elasticity coefficient corresponding to the target customer, and the second discount data may be displayed directly on the display screen of the terminal device for viewing by the staff at the logistics sales side, which is not limited in this exemplary embodiment.
The technical solution of the embodiment of the present disclosure is explained in detail below:
In the exemplary embodiment of the present disclosure, a scenario in which a logistics sales side signs service contracts with a large number of customers is taken as an example for explanation. At present, when the logistics sales side prices products during the signing process, price discounts are usually given according to experience or a large number of rules, which is highly subjective and lacks data guidance. For example, a price discount that is set too low may result in reduced revenue. Moreover, the entire process of giving a price discount manually takes a long time.
Based on one or more of the problems described above, the present exemplary embodiment provides a data processing method. Referring to fig. 2, the data processing method may include the following steps S210 to S240:
step S210, obtaining multi-dimensional feature data of a target client, and generating a feature vector of the target client according to the multi-dimensional feature data, wherein the multi-dimensional feature data comprises attribute feature data, behavior feature data and order feature data;
step S220, inputting the feature vector into a preset regression model, and fitting to obtain first discount data;
step S230, calculating a price elasticity coefficient corresponding to the target customer through a price elasticity model based on the first discount data and the order feature data;
and step S240, adjusting the first discount data according to the price elasticity coefficient to obtain second discount data.
In the data processing method provided by the exemplary embodiment of the present disclosure, multi-dimensional feature data of a target client is obtained, and a feature vector of the target client is generated according to the multi-dimensional feature data; the feature vector is input into a preset regression model, and first discount data is obtained by fitting; a price elasticity coefficient corresponding to the target customer is calculated through a price elasticity model based on the first discount data and the order feature data; and the first discount data is adjusted according to the price elasticity coefficient to obtain second discount data. In this method, the price elasticity coefficient is used to represent the customer sensitivity level, a predicted discount is obtained by regression model fitting, and the predicted discount is then adjusted according to the customer sensitivity level, so that a final discount suited to each customer can be given accurately and efficiently.
The above steps of the present exemplary embodiment will be described in more detail below.
In step S210, multi-dimensional feature data of a target customer is obtained, and a feature vector of the target customer is generated according to the multi-dimensional feature data, where the multi-dimensional feature data includes attribute feature data, behavior feature data, and order feature data.
In a scenario where a logistics sales side signs service contracts with a large number of clients, the target client is a client with whom a contract is to be signed. For example, the target client may be an old customer whose service contract is due and needs to be renewed, or a new customer who needs to sign a service contract. In an example embodiment of the present disclosure, the multi-dimensional feature data of the target client may include feature data of multiple dimensions, such as attribute feature data, behavior feature data, and order feature data. The attribute feature data of the target client can be obtained from a basic attribute feature table, and may include third-level industry information of the product signed by the target client, the client level, and the like. For example, the third-level industry information of the logistics product is Android mobile phones, the client level is level one, and the client's credit is good. If the target client is an old client, the behavior feature data of the client can be obtained from the client's historical payment collection and claim settlement records. If the target client is a new client, the behavior feature data does not need to be acquired. The behavior feature data may include payment collection data (whether payments are made on time), claim settlement data (whether a claim has been settled), historical invoice amount (such as the invoice amount in the last half year), and the like. For a new customer, order feature data can be obtained in the process of signing the service contract. The order feature data may be the content filled in by the client in the contract, such as order amount, income commitment, contract month, contract area, average weight, contract type, sales area, starting place of logistics, and the like. For an old customer, historical order feature data can be obtained, which also includes the historical discount data of successfully signed contracts.
After the multi-dimensional feature data of the target client is obtained, vectorization processing can be performed on the feature data of each dimension to obtain a plurality of feature vectors, and the feature vectors are spliced to generate the feature vector of the target client. The multi-dimensional feature data may include at least one of discrete feature data and continuous feature data. The order amount, average weight, income commitment and the like are continuous feature data, while the third-level industry information, the starting place of logistics and the like are discrete feature data. For example, each discrete feature data may be counted to obtain a plurality of first feature data, and each first feature data is encoded to generate a plurality of first feature vectors. Each continuous feature data is normalized to generate a plurality of second feature vectors.
In an exemplary embodiment, for discrete feature data such as the third-level industry information, Chinese word segmentation can be performed on the information to split it into range attributes with finer granularity, thereby forming new features. The word segmentation can be dictionary-based, statistics-based, or rule-based. For example, the discrete feature values may be counted, the frequency of occurrence of each feature value may be taken as first feature data, and the first feature data may be encoded to obtain a first feature vector. For example, the starting place of logistics contains a plurality of feature words and can be encoded as a multi-hot feature, while the remaining discrete features can be encoded as one-hot features. For the continuous feature data, normalization processing may be performed to generate a second feature vector. Normalization converts features of different magnitudes into features of the same magnitude, so that differences in magnitude do not affect the model. For example, each dimension feature can be linearly mapped to a target range, such as [0,1] or [-1,1]; standardization by standard deviation may also be used, which is not limited by this disclosure.
In an example embodiment of the present disclosure, a pipeline mechanism may be utilized to serially connect vectorization processes of feature data of each dimension, so as to implement concatenation of a plurality of first feature vectors and a plurality of second feature vectors, thereby generating a feature vector of a target client. It can be seen that the feature vector of the target client is composed of normalized continuous feature data and encoded discrete features. The Pipeline mechanism is a batch processing technology, and can improve data processing efficiency.
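By way of illustration only, the encoding, normalization, and splicing described above can be sketched in plain Python. The field names (`industry`, `origin_words`, `order_amount`, `avg_weight`), the helper names, and the value ranges are hypothetical and are not taken from the disclosure; min-max scaling is used here as one of the normalization options mentioned above.

```python
from collections import Counter


def build_vocabulary(values):
    """Count discrete values and return them ordered by frequency."""
    return [value for value, _ in Counter(values).most_common()]


def one_hot(value, vocabulary):
    """Encode a single discrete value as a one-hot vector."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def multi_hot(values, vocabulary):
    """Encode a set of feature words as a multi-hot vector."""
    present = set(values)
    return [1.0 if v in present else 0.0 for v in vocabulary]


def min_max(value, lo, hi):
    """Linearly map a continuous value from [lo, hi] to [0, 1]."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def build_feature_vector(customer, industry_vocab, origin_vocab, ranges):
    """Splice encoded discrete features and normalized continuous features."""
    # First feature vectors: encoded discrete features (hypothetical fields).
    first_vectors = [
        one_hot(customer["industry"], industry_vocab),
        multi_hot(customer["origin_words"], origin_vocab),
    ]
    # Second feature vectors: normalized continuous features.
    second_vectors = [
        [min_max(customer[name], lo, hi)] for name, (lo, hi) in ranges.items()
    ]
    vector = []
    for part in first_vectors + second_vectors:
        vector.extend(part)
    return vector
```

In this sketch the concatenation order is fixed (discrete features first, continuous features second), so that every customer is mapped to a vector with the same layout.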
In step S220, the feature vector is input into a preset regression model, and a first discount data is obtained through fitting.
In an exemplary embodiment, the preset regression model may be a random forest model. A random forest trains on and predicts sample data using a plurality of decision trees; the final result is obtained by combining the outputs of multiple weak learners through voting or averaging, so that the overall model has high accuracy and good generalization performance. After the feature vector of the target customer is obtained, the feature vector can be input into the random forest model to obtain the first discount data, namely the predicted discount. For example, the output value range of the first discount data may be [0, 100]; when the first discount data is 10, the predicted discount is 1-fold (that is, 10% of the list price). In other examples, the preset regression model may also be a linear regression model, a logistic regression model, etc., which is not limited in this disclosure.
In this example, the random forest model functions like an expert system and can learn the experience and rules of the logistics sales side well, so that the predicted discount of the target customer is output accurately and the accuracy of data processing is improved. Moreover, because the signing discount is obtained through the regression model, subjective factors on the logistics sales side can be avoided, which further improves the accuracy of data processing.
In step S230, based on the first discount data and the order feature data, a price elasticity coefficient corresponding to the target customer is calculated through a price elasticity model.
In an example embodiment of the present disclosure, the price elasticity model may be a DML (Double Machine Learning) model. A DML model can be used to model high-dimensional data, and its built-in regularization serves the purpose of variable selection. For example, referring to fig. 3, the price elasticity coefficient corresponding to the target customer may be obtained through DML model fitting according to steps S310 to S350.
In step S310, sample data is obtained, where the sample data is historical multidimensional feature data in a preset time period, and the historical multidimensional feature data includes attribute feature data, behavior feature data, order feature data, and historical discount data of a historical customer.
For example, the sample data may be historical multidimensional feature data of contracts successfully signed in the last 2 years, including attribute feature data, behavior feature data, order feature data, and historical discount data of historically signed customers. For example, data such as the customer level of a historically signed customer, payment collection data, historical invoice amount, contract month, contract area, and final contract discount are acquired. Data cleaning may be performed on the sample data. For example, noise data falling outside 3σ may be removed, since there may be noisy samples in which the committed invoice amount deviates greatly from the actual invoice amount.
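By way of illustration only, 3σ cleaning of the sample data can be sketched as follows. The function name and the `deviation` field (taken here to hold the gap between the committed and the actual invoice amount) are hypothetical; the population standard deviation is an assumption of this sketch.

```python
from statistics import mean, pstdev


def three_sigma_filter(samples, key):
    """Keep only samples whose `key` value lies within 3 sigma of the mean."""
    values = [sample[key] for sample in samples]
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(samples)
    return [s for s in samples if abs(s[key] - mu) <= 3 * sigma]
```

Note that a single extreme value can only fall outside 3σ when the sample set is reasonably large, because the outlier itself inflates the standard deviation.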
In the DML model, the sample data is generally divided into two parts, namely sample data A and sample data B. The discount data and the target order feature data (such as the invoice amount) can each be fitted by using sample data A, and the price elasticity coefficient between the discount data and the target order feature data can be fitted by using sample data B. Alternatively, the discount data can be fitted by using sample data A, the invoice amount can be fitted by using sample data B, and the price elasticity coefficient can be calculated according to the fitting results, which is not limited by the disclosure.
In step S320, the historical multidimensional feature data after the target order feature data is removed is fitted to obtain a first expected value of the historical discount data.
In an example embodiment of the present disclosure, the target order feature data is the invoice amount, and the feature data other than the invoice amount and the discount data are confounding variables. For example, the invoice amount in sample data A may be removed, and the historical discount data T may be fitted by using a random forest model; that is, the discount data is predicted from the confounding variables, and the first expected value E(T) of the historical discount data T is obtained.
In step S330, the historical multidimensional feature data from which the historical discount data has been removed is fitted to obtain a second expected value of the target order feature data.
Similarly, the historical discount data in the sample data a may be removed, and a random forest model is used to fit the invoice amount Y, that is, the invoice amount is predicted by using the confounding variable, and a second expected value E (Y) of the invoice amount Y is obtained.
In step S340, a price elasticity coefficient corresponding to the historical customer is calculated according to the target order feature data, the historical discount data, the first expected value, and the second expected value.
After the first expected value E(T) of the historical discount data T and the second expected value E(Y) of the invoice amount Y are obtained, the historical discount data and the target order feature data in sample data B may be used to calculate residuals, that is:
$$\tilde{T} = T - E(T) \tag{1}$$
$$\tilde{Y} = Y - E(Y) \tag{2}$$
In this way, the historical discount data and the invoice amount that are not influenced by the confounding variables are obtained, namely $\tilde{T}$ and $\tilde{Y}$.
Wherein T is the historical discount data, Y is the target order feature data, namely the invoice amount, E(T) is the first expected value of the historical discount data T, and E(Y) is the second expected value of the invoice amount Y.
Then, by performing log-log regression on $\tilde{T}$ and $\tilde{Y}$, the price elasticity coefficient θ corresponding to the historical client is calculated:
$$\log \tilde{Y} \sim \theta \log \tilde{T} + \epsilon \tag{3}$$
From equations (1) and (2), equation (3) can be rewritten as:
$$\log[Y - E(Y)] \sim \theta \log[T - E(T)] + \epsilon \tag{4}$$
wherein $\epsilon$ represents the intercept in the regression equation.
It is understood that, according to step S340, a price elasticity coefficient corresponding to each historical client can be obtained.
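By way of illustration only, the residual-based log-log regression of step S340 can be sketched as an ordinary least-squares fit on the logarithms of the residuals. The function names are hypothetical. Because a logarithm is only defined for positive arguments, this sketch simply skips samples for which either residual is non-positive; that handling is an assumption of the sketch and is not specified in the disclosure.

```python
import math


def ols_slope_intercept(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("regressor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def price_elasticity(T, Y, expected_T, expected_Y):
    """Estimate theta by regressing log(Y - E(Y)) on log(T - E(T))."""
    log_t, log_y = [], []
    for t, y, et, ey in zip(T, Y, expected_T, expected_Y):
        residual_t = t - et  # discount residual
        residual_y = y - ey  # invoice amount residual
        # Assumption of this sketch: drop samples whose residuals
        # cannot be log-transformed.
        if residual_t > 0 and residual_y > 0:
            log_t.append(math.log(residual_t))
            log_y.append(math.log(residual_y))
    if len(log_t) < 2:
        raise ValueError("need at least two usable samples")
    theta, intercept = ols_slope_intercept(log_t, log_y)
    return theta, intercept
```

Here `expected_T` and `expected_Y` stand for the first and second expected values produced by the fitted models in steps S320 and S330.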
In step S350, the price elasticity coefficient corresponding to the target customer is determined according to the price elasticity coefficient corresponding to the historical customer.
After the price elasticity coefficient corresponding to each historical client is obtained, if the target client is an old client needing to renew the service contract, the price elasticity coefficient of that client can be obtained directly. If the target client is a new client needing to sign a service contract, the similarity between the feature vector of the target client and the feature vector of each historical client can be calculated, the historical client most similar to the target client is determined, and the price elasticity coefficient of that historical client is used as the price elasticity coefficient of the target client.
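By way of illustration only, the lookup for a new client can be sketched as a nearest-neighbor search. The disclosure does not name a similarity measure; cosine similarity is assumed here, and the record layout (`vector`, `theta`) and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def elasticity_for_new_customer(target_vector, historical_customers):
    """Return the theta of the historical customer most similar to the target."""
    most_similar = max(
        historical_customers,
        key=lambda h: cosine_similarity(target_vector, h["vector"]),
    )
    return most_similar["theta"]
```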
In an example embodiment, before the feature vector of the target customer is input into the preset regression model to obtain the first discount data, the preset regression model and the price elasticity model may be trained in advance. For example, sample data may be obtained, where the sample data is historical multidimensional feature data within a preset time period, for example within the last 2 years, and data cleaning is performed. For example, noise data falling outside 3σ may be discarded, since there may be noisy samples in which the committed invoice amount deviates greatly from the actual invoice amount. The parameters of the preset regression model and the parameters of the price elasticity model are then iteratively trained, respectively, by using the cleaned sample data, and the training is finished when the iteration termination condition is met.
Specifically, the cleaned sample data may be randomly divided into training sample data and test sample data, for example at a ratio of 8:2. The training sample data is used for training the model to improve its performance, and the test sample data is used for evaluating the performance of the model. When the parameters of the preset regression model are optimized, taking the random forest model as an example, the parameters may include the number of decision trees, the tree depth, the number of leaves, the minimum number of samples required to split a leaf, and the like. When the parameters of the price elasticity model are optimized, taking the DML model as an example, the parameters may include a learning rate, a regularization parameter, a number of iterations, and the like. When the parameters of the random forest model and the parameters of the DML model are respectively trained, the training is terminated when all the parameters tend to converge or a certain number of iterations is reached.
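By way of illustration only, the random 8:2 division of the cleaned sample data can be sketched as follows. The function name and the fixed seed are hypothetical choices made so that the split is reproducible.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Randomly split samples into training and test sets (default 8:2)."""
    shuffled = list(samples)
    # A dedicated generator keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```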
Finally, the performance of each model may be evaluated using the test sample data. For example, the R² index and the MAE (Mean Absolute Error) index may be used for evaluation. For the R² index, R² ≤ 1, and a larger value of R² indicates better model performance. For the MAE index, a smaller value of MAE indicates better model performance.
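By way of illustration only, the two evaluation indexes can be computed from the true and predicted discounts of the test sample data as sketched below. The function names are hypothetical; the formulas are the standard definitions of the coefficient of determination and the mean absolute error.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot; at most 1, larger is better."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot
```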
In this example, the price elasticity model is used to perform elasticity analysis on the customers, so that each customer's sensitivity to price can be analyzed accurately, and the final discount of each customer can be determined flexibly and accurately.
In step S240, the first discount data is adjusted according to the price elasticity coefficient to obtain second discount data.
In example embodiments of the present disclosure, a mapping relationship between price elasticity coefficients and customer sensitivity levels may be established. Illustratively, the customers can be classified into 3 sensitivity levels, namely high-sensitivity customers, medium-sensitivity customers and low-sensitivity customers according to the price elasticity coefficient theta. When the price elasticity coefficient is larger than the first elasticity coefficient threshold value, the client is a high-sensitivity client; when the price elastic coefficient is larger than or equal to the second elastic coefficient threshold value and is smaller than or equal to the first elastic coefficient threshold value, the client is a sensitive client; when the price elastic coefficient is less than the second elastic coefficient threshold value, the client is a low-sensitive client. The specific values of the two elastic coefficient thresholds are not limited in this disclosure.
For example, the first elasticity coefficient threshold may be 1, and the second elasticity coefficient threshold may be 0.8. The sensitivity level of the target customer is then determined according to the price elasticity coefficient. When the price elasticity coefficient θ is greater than 1, the target customer may be determined to be a high-sensitivity customer; when θ is greater than or equal to 0.8 and less than or equal to 1, the target customer may be determined to be a medium-sensitivity customer; when θ is less than 0.8, the target customer may be determined to be a low-sensitivity customer.
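The threshold mapping above can be written as a small function. In this sketch the default thresholds 1 and 0.8 are the example values from the text; the function name and the string labels are assumptions chosen for illustration.

```python
def sensitivity_level(theta, first_threshold=1.0, second_threshold=0.8):
    """Map a price elasticity coefficient to a customer sensitivity level."""
    if second_threshold > first_threshold:
        raise ValueError("second_threshold must not exceed first_threshold")
    if theta > first_threshold:
        return "high"
    if theta >= second_threshold:   # second_threshold <= theta <= first_threshold
        return "medium"
    return "low"


for theta in (1.2, 1.0, 0.8, 0.5):
    print(theta, sensitivity_level(theta))
```

Both boundary values fall into the medium level, matching the inclusive comparison stated in the text.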
After the sensitivity level of the target customer is determined, the given first discount data can be adjusted according to that sensitivity level to obtain the second discount data. It can be understood that a high-sensitivity customer is sensitive to price, so the price can be appropriately reduced; a low-sensitivity customer is not sensitive to price, so the price can be appropriately increased; and for a medium-sensitivity customer, the price need not be adjusted.
Illustratively, the target discount data increment may be calculated based on the first discount data, the price elasticity coefficient, and the order characteristic data. Specifically, the optimal solution of equation (5) may be calculated to obtain the target discount data increment, i.e. when Δp is the target discount data increment, the profit is maximized.
Wherein the discount data is the first discount data, q is the invoice amount promised by the customer when signing the contract, Δp is the discount data increment, such as 1-fold or 2-fold (that is, 10% or 20% of the list price), θ is the price elasticity coefficient corresponding to the target customer, and R is the profit, i.e. the product of the unit price and the invoice amount. Using θ as an exponent may amplify the revenue, which facilitates determining the target discount data increment at which the revenue is maximized.
After the target discount data increment is determined, the given first discount data can be adjusted according to the sensitivity level of the target customer. When the target customer is a high-sensitivity customer, the first discount data may be reduced by the target discount data increment to obtain the second discount data. When the target customer is a low-sensitivity customer, the first discount data may be increased by the target discount data increment to obtain the second discount data. When the target customer is a medium-sensitivity customer, the first discount data may be used as the second discount data. For example, suppose the first discount data is 7-fold (the customer pays 70% of the list price) and the target discount data increment is 1-fold (10% of the list price). If the target customer is a high-sensitivity customer, a 6-fold discount can be given, so that the lower price increases the order amount and thereby the income. If the target customer is a medium-sensitivity customer, the final discount remains 7-fold. If the target customer is a low-sensitivity customer, the final discount is 8-fold, and the profit is increased by raising the price.
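The adjustment rule in the preceding paragraph can be sketched as follows. Discounts are expressed here as integer percentages of the list price (so 7-fold is 70) purely to avoid floating-point rounding; that representation, the function name, and the level labels are assumptions made for this illustration.

```python
def adjust_discount(first_discount, increment, level):
    """Derive the second discount data from the first discount data.

    first_discount and increment are percentages of the list price.
    """
    if level == "high":      # price-sensitive: lower the price
        return first_discount - increment
    if level == "low":       # price-insensitive: raise the price
        return first_discount + increment
    if level == "medium":    # leave the predicted discount unchanged
        return first_discount
    raise ValueError(f"unknown sensitivity level: {level!r}")


# The worked example from the text: 7-fold first discount, 1-fold increment.
for level in ("high", "medium", "low"):
    print(level, adjust_discount(70, 10, level))
```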
In this example, the price sensitivity of the customer is quantified, the estimated discount given by the preset regression model is flexibly adjusted according to that price sensitivity, and a final discount suited to each customer is given, so that the total income of the logistics sales side can be improved.
Referring to fig. 4, the contracted discount of the customer may be determined according to steps S401 to S404.
Step S401, obtaining order characteristic data. When the logistics sales side signs a contract with a customer, order characteristic data is collected from the signed contract;
Step S402, predicting the contract discount by using a random forest model;
Step S403, predicting customer sensitivity by using a DML model;
Step S404, adjusting the predicted discount according to the customer sensitivity, and signing the contract with the final discount.
In this example, the order of step S402 and step S403 is not limited. Adjusting the contract discount based on the customer's degree of sensitivity can improve income, and can also save the human resources that the logistics sales side would otherwise spend on setting the contract discount subjectively.
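The flow of steps S401 to S404 can be sketched as a small orchestration function. The two predictors are passed in as callables; `fake_forest` and `fake_dml` below are hypothetical stand-ins for the trained random forest and DML models, and the thresholds and 10-point increment in `simple_adjust` simply reuse the example values from the text.

```python
def contract_discount(order_features, predict_discount, predict_elasticity, adjust):
    """Run steps S402 to S404 on the features collected in step S401."""
    predicted = predict_discount(order_features)   # S402: predicted discount
    theta = predict_elasticity(order_features)     # S403: customer sensitivity
    return adjust(predicted, theta)                # S404: final discount


# Hypothetical stand-ins, used only to make the sketch executable.
def fake_forest(features):
    return 70      # customer pays 70% of the list price


def fake_dml(features):
    return 1.3     # price elasticity coefficient


def simple_adjust(predicted, theta):
    if theta > 1.0:        # high sensitivity: lower the price
        return predicted - 10
    if theta < 0.8:        # low sensitivity: raise the price
        return predicted + 10
    return predicted       # medium sensitivity: unchanged


final = contract_discount({"promised_volume": 500}, fake_forest, fake_dml, simple_adjust)
print(final)  # 60
```

Because S402 and S403 do not depend on each other's output, the two calls could be swapped or run concurrently without changing the result.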
In the data processing method provided by the exemplary embodiment of the present disclosure, the multidimensional feature data of a target client is obtained, and a feature vector of the target client is generated according to the multidimensional feature data; inputting the characteristic vector into a preset regression model, and fitting to obtain first discount data; calculating a price elasticity coefficient corresponding to the target customer through a price elasticity model based on the first discount data and the order characteristic data; and adjusting the first discount data according to the price elasticity coefficient to obtain second discount data. According to the method, the predicted discount is obtained through regression model fitting on the basis that the price elasticity coefficient is used for representing the customer sensitivity level, the predicted discount is further adjusted according to the customer sensitivity level, and the final discount suitable for different customers can be accurately and efficiently given.
It should be noted that although the various steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that these steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken into multiple step executions, etc.
Further, in the present exemplary embodiment, a data processing apparatus is also provided, and the apparatus may be applied to a server or a terminal device. Referring to fig. 5, the data processing apparatus 500 may include a multidimensional feature data acquisition module 510, a first discount data determination module 520, a price elasticity coefficient determination module 530, and a second discount data determination module 540, wherein:
a multidimensional feature data acquisition module 510, configured to obtain multidimensional feature data of a target client, and generate a feature vector of the target client according to the multidimensional feature data, where the multidimensional feature data includes attribute feature data, behavior feature data, and order feature data;
a first discount data determination module 520, configured to input the feature vector into a preset regression model to obtain first discount data;
a price elasticity coefficient determination module 530, configured to calculate a price elasticity coefficient corresponding to the target customer through a price elasticity model;
and a second discount data determination module 540, configured to adjust the first discount data according to the price elasticity coefficient, so as to obtain second discount data.
In an alternative embodiment, the multidimensional feature data acquisition module 510 includes:
the characteristic vector generation submodule is used for vectorizing the multi-dimensional characteristic data to obtain a plurality of characteristic vectors;
and the feature vector splicing submodule is used for splicing the plurality of feature vectors to generate the feature vector of the target client.
In an alternative embodiment, the multi-dimensional feature data includes at least one of discrete feature data and continuous feature data; the feature vector generation submodule comprises:
the first characteristic data processing unit is used for counting the discrete characteristic data to obtain a plurality of first characteristic data;
a first feature vector generation unit configured to encode each of the first feature data and generate a plurality of first feature vectors;
and a second feature vector generation unit configured to normalize each of the continuous feature data and generate a plurality of second feature vectors.
In an optional embodiment, the feature vector splicing submodule is configured to splice the plurality of first feature vectors and the plurality of second feature vectors to generate the feature vector of the target customer.
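The vectorizing and splicing performed by these submodules can be sketched as below. The disclosure does not fix the encoding or the normalization scheme; one-hot encoding and min-max scaling are assumed here as common choices, and the feature names (`industry`, `monthly_orders`) are hypothetical.

```python
def one_hot(value, categories):
    """Encode a discrete value as a one-hot vector over the counted categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max(value, lo, hi):
    """Scale a continuous value into [0, 1] given its observed range."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def build_feature_vector(customer, categories, ranges):
    """Splice encoded discrete features and normalized continuous features."""
    vector = []
    for name, cats in categories.items():       # first feature vectors
        vector.extend(one_hot(customer[name], cats))
    for name, (lo, hi) in ranges.items():       # second feature vectors
        vector.append(min_max(customer[name], lo, hi))
    return vector


categories = {"industry": ["retail", "apparel", "electronics"]}
ranges = {"monthly_orders": (0.0, 1000.0)}
customer = {"industry": "apparel", "monthly_orders": 250.0}
print(build_feature_vector(customer, categories, ranges))  # [0.0, 1.0, 0.0, 0.25]
```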
In an alternative embodiment, the price elasticity coefficient determination module 530 includes:
the sample data acquisition submodule is used for acquiring sample data, wherein the sample data is historical multidimensional characteristic data in a preset time period, and the historical multidimensional characteristic data comprises attribute characteristic data, behavior characteristic data, order characteristic data and historical discount data of a historical client;
the first data fitting submodule is used for fitting the historical multidimensional characteristic data after the characteristic data of the target order are removed to obtain a first expected value of the historical discount data;
the second data fitting submodule is used for fitting the historical multidimensional characteristic data after historical discount data are removed to obtain a second expected value of the target order characteristic data;
a first coefficient calculation submodule, configured to calculate a price elasticity coefficient corresponding to the historical customer according to the target order feature data, the historical discount data, the first expected value, and the second expected value;
and the second coefficient calculation submodule is used for determining the price elasticity coefficient corresponding to the target customer according to the price elasticity coefficient corresponding to the historical customer.
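As an illustration of how an elasticity coefficient can be computed from the two expected values, the sketch below uses the residual-on-residual (partialling-out) estimator that is commonly associated with double machine learning. This form is an assumption for illustration; the formulas given in the method section of this disclosure govern the actual calculation, including any sign or scaling convention. All names and sample values are hypothetical.

```python
def residual_elasticity(order_values, discounts, expected_discounts, expected_orders):
    """Regress order residuals on discount residuals (no intercept).

    expected_discounts: first expected value (fitted historical discount data)
    expected_orders:    second expected value (fitted target order feature data)
    """
    d_res = [d - e for d, e in zip(discounts, expected_discounts)]
    o_res = [o - e for o, e in zip(order_values, expected_orders)]
    denom = sum(r * r for r in d_res)
    if denom == 0:
        raise ValueError("discount residuals are all zero; coefficient undefined")
    return sum(dr * orr for dr, orr in zip(d_res, o_res)) / denom


# Hypothetical history: discounts in tenths of list price, orders as volumes.
theta = residual_elasticity(
    order_values=[120, 100, 80],
    discounts=[6, 7, 8],
    expected_discounts=[7, 7, 7],
    expected_orders=[100, 100, 100],
)
print(theta)  # -20.0
```

In this toy history, each additional tenth of list price charged is associated with 20 fewer orders after the fitted expectations are removed.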
In an alternative embodiment, the data processing apparatus 500 further comprises:
the model training module is configured to acquire sample data, wherein the sample data is historical multidimensional characteristic data in a preset time period; respectively carrying out iterative training on the parameters of the preset regression model and the parameters of the price elasticity model by using the sample data; and when the iteration termination condition is met, finishing the training of the parameters of the preset regression model and the parameters of the price elasticity model.
In an alternative embodiment, the second discount data determination module 540 includes:
the client sensitivity level determining submodule is used for determining the sensitivity level of the target client according to the price elasticity coefficient;
and the second discount data determining submodule is used for correspondingly adjusting the first discount data according to the sensitivity level to obtain the second discount data.
In an alternative embodiment, the customer sensitivity level determination sub-module includes:
the first sensitivity level determining unit is used for determining that the target customer is a high-sensitivity customer when the price elasticity coefficient is larger than a first elasticity coefficient threshold value;
the second sensitivity level determining unit is used for determining that the target customer is a medium-sensitivity customer when the price elasticity coefficient is larger than or equal to a second elasticity coefficient threshold value and is smaller than or equal to the first elasticity coefficient threshold value;
and the third sensitivity level determining unit is used for determining that the target customer is a low-sensitivity customer when the price elasticity coefficient is smaller than the second elasticity coefficient threshold value.
In an alternative embodiment, the second discount data determination sub-module includes:
the discount data increment calculating unit is used for calculating target discount data increments according to the first discount data, the price elasticity coefficient and the target order characteristic data;
the second discount data calculation unit is used for reducing the first discount data by the target discount data increment to obtain second discount data when the target customer is a high-sensitivity customer; increasing the first discount data by the target discount data increment to obtain second discount data when the target customer is a low-sensitivity customer; and taking the first discount data as the second discount data when the target customer is a medium-sensitivity customer.
In an alternative embodiment, the preset regression model in the data processing apparatus 500 is a random forest model.
The specific details of each module in the data processing apparatus have been described in detail in the corresponding data processing method, and therefore are not described herein again.
Each module in the above apparatus may be implemented by a general-purpose processor, such as a central processing unit or a network processor, or by a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The modules may also be implemented in software, firmware, etc. The processors in the above apparatus may be independent processors or may be integrated together.
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing an electronic device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the electronic device. The program product may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on an electronic device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Exemplary embodiments of the present disclosure also provide an electronic device capable of implementing the above method. An electronic device 600 according to this exemplary embodiment of the present disclosure is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may take the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that couples various system components including the storage unit 620 and the processing unit 610, and a display unit 640.
The storage unit 620 stores program code, which may be executed by the processing unit 610, to cause the processing unit 610 to perform the steps according to various exemplary embodiments of the present disclosure described in the above-mentioned "exemplary method" section of this specification. For example, the processing unit 610 may perform any one or more of the method steps of fig. 2-4.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 621 and/or a cache memory unit 622, and may further include a read only memory unit (ROM) 623.
The storage unit 620 may also include a program/utility 624 having a set (at least one) of program modules 625, such program modules 625 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 over the bus 630. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some embodiments, the data processing methods described in the present disclosure may be performed by the processing unit 610 of the electronic device. In some embodiments, the rule model may be configured through the input/output interface 650. For example, the corresponding model is configured according to a preset model data format through a model management interface provided by the electronic device, and the corresponding script is configured according to a script data format through a script configuration interface provided by the electronic device. In some embodiments, the results of the execution of the business process may be output to the external device 700 via the input/output interface 650 for viewing by the user.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the exemplary embodiments of the present disclosure.
Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes illustrated in the above figures are not intended to indicate or limit the temporal order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.