Disclosure of Invention
The embodiments of the present application aim to provide a model optimization method for enhancing fairness, which can improve the fairness of a model by adding an optimization unit to the model after model training is finished.
The first aspect of the present application provides a model optimization method for enhancing fairness, comprising:
obtaining a training sample data set and a test sample data set, wherein the training sample data set comprises a discriminated training sample set and a normal training sample set;
performing preliminary training of a two-class machine learning model according to the training sample data set to obtain a first model;
constructing an optimization unit in the first model to obtain a second model with a modified model structure;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model;
performing loop optimization on the third model according to the training sample data set to obtain an optimal optimization unit parameter value;
setting the optimization unit parameters of the optimization unit in the third model according to the optimal optimization unit parameter values to obtain a fourth model;
performing fine tuning optimization on the fourth model according to the test sample data set to obtain fine tuning optimization unit parameter values;
and setting the optimization unit parameters of the optimization unit in the fourth model according to the fine tuning optimization unit parameter values to obtain a target model for completing optimization.
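For orientation, the following minimal Python sketch strings these eight steps together. It is not code from the application: every stage function is a hypothetical placeholder that a concrete implementation (such as the sketches in the embodiments below) would supply.

```python
# Orchestration sketch of the first-aspect method; all stage functions are
# hypothetical placeholders supplied by the caller.
from typing import Any, Callable

def optimize_for_fairness(
    train_set: Any,
    test_set: Any,
    train_preliminary: Callable[[Any], Any],     # -> first model
    attach_unit: Callable[[Any], Any],           # -> second model
    set_unit_params: Callable[[Any, Any], Any],  # -> third/fourth/target model
    preset_params: Any,                          # preset unit parameter values
    loop_optimize: Callable[[Any, Any], Any],    # -> optimal parameter values
    fine_tune: Callable[[Any, Any], Any],        # -> fine-tuned values
) -> Any:
    first_model = train_preliminary(train_set)
    second_model = attach_unit(first_model)
    third_model = set_unit_params(second_model, preset_params)
    best_params = loop_optimize(third_model, train_set)
    fourth_model = set_unit_params(third_model, best_params)
    ft_params = fine_tune(fourth_model, test_set)
    return set_unit_params(fourth_model, ft_params)  # optimized target model
```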
In the implementation process, the method performs preliminary training of the two-class machine learning model on samples of the discriminated population and samples of the normal population, so that the model's discriminatory behavior toward the sample population is exposed from the start, which prepares for the subsequent addition of an optimization unit to correct that behavior.
Further, the training cut-off condition of the first model is that a preset cross entropy loss function converges or a training cycle exceeds a preset iteration threshold;
the first model comprises a model front part and a sigmoid unit;
in the second model, the optimization unit includes a first neuron and a second neuron at a first layer, and a third neuron at a second layer;
the inputs of the first neuron and the second neuron are population attribute labels of training samples in the training sample dataset;
the input of the front part of the model is a training sample in the training sample data set;
the output of the model front, the output of the first neuron and the output of the second neuron are respectively connected with the input of the third neuron;
the output of the third neuron is connected with the input of the sigmoid unit;
the output of the sigmoid unit is the output of the second model;
The optimization unit parameters of the optimization unit include a first weight parameter of the first neuron, a first bias parameter of the first neuron, a second weight parameter of the second neuron, a second bias parameter of the second neuron, a third weight parameter of the third neuron, and a third bias parameter of the third neuron.
In the implementation process, the method disassembles the two-class machine learning model (namely, the first model) and adds an optimization unit with a specific structure at a specific position, so that the second model with the modified model structure has the basic capability of overcoming the discriminatory behavior toward the sample population.
Further, the output of the third neuron in the second model is:
o3 = (w31·w1 + w32·w2)·ci + w33·fi + w31·b1 + w32·b2 + b3;
wherein o3 is the output of the third neuron, w1 is the first weight parameter, b1 is the first bias parameter, w2 is the second weight parameter, b2 is the second bias parameter, w31, w32 and w33 are the third weight parameters, b3 is the third bias parameter, fi is the output of the model front for the training sample, and ci is the population attribute label of the training sample;
the preset optimizing unit parameter values are specifically as follows:
w31 = w33 = 1;
w32 = -1, b2 = b3 = 0, b1 = w2;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model, wherein the output of the third neuron in the third model is as follows:
o3 = (w1 - w2)·ci + fi + w2.
In the implementation process, the method specifically defines the optimization unit and proposes a simplified parameterization, which fixes the calculation structure of the optimization unit in preparation for the subsequent parameter adjustment.
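As an editorial aside, this simplification can be checked symbolically; the following sketch (using sympy, not part of the application) substitutes the preset values into the general expression for o3.

```python
# Symbolic check that the preset values reduce the general o3 to the
# simplified form; variable names mirror the formulas above.
import sympy as sp

w1, w2, b1, b2, b3 = sp.symbols("w1 w2 b1 b2 b3")
w31, w32, w33, ci, fi = sp.symbols("w31 w32 w33 c_i f_i")

o3 = (w31*w1 + w32*w2)*ci + w33*fi + w31*b1 + w32*b2 + b3
preset = {w31: 1, w33: 1, w32: -1, b2: 0, b3: 0, b1: w2}

simplified = sp.expand(o3.subs(preset))
assert simplified == sp.expand((w1 - w2)*ci + fi + w2)
print(simplified)   # c_i*w1 - c_i*w2 + f_i + w2
```

Note that, under this parameterization, a sample of the discriminated population (ci = 1) receives the logit fi + w1 and a sample of the normal population (ci = 0) receives fi + w2, so the unit simply adds a per-population offset to the output of the model front.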
Further, performing loop optimization on the third model according to the training sample data set to obtain an optimal optimization unit parameter value includes:
performing model initial optimization on the third model according to the training sample data set to obtain initial optimization unit parameter values;
and performing model loop optimization on the third model according to the training sample data set and the initial optimization unit parameter values to obtain an optimal optimization unit parameter value.
In the implementation process, the method divides the parameter determination process of the optimization unit into two parts, model initial optimization and model loop optimization, which together ensure the quality of the determined optimization unit parameters and overcome the sample discrimination problem of the preliminary model.
Further, performing model initial optimization on the third model according to the training sample data set to obtain initial optimization unit parameter values includes:
setting the value of the second weight parameter to 0 and keeping the value unchanged;
continuously increasing the value of the first weight parameter from 0 according to a first preset value increasing algorithm until the difference between the first positive rate of the third model on the discriminated training sample set and the second positive rate of the third model on the normal training sample set is within a first preset range, determining the value of the first weight parameter at that point as the first weight parameter value, and calculating the first accuracy rate of the third model on the training sample data set;
determining the first accuracy rate as the historical maximum accuracy rate;
and determining the first weight parameter value and the value of the second weight parameter as the initial optimization unit parameter values.
In the implementation process, the method fixes the second weight parameter at 0, first obtains a suitable value of the first weight parameter, and records the corresponding accuracy rate, thereby realizing a high-quality initialization of the optimization unit values and laying the groundwork for the subsequent parameter optimization.
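A runnable sketch of this initialization phase is given below, under stated assumptions: the model front is reduced to precomputed logits on the two sample sets, the simplified unit output o3 = (w1 - w2)·ci + fi + w2 is used, and the step size, the tolerance A, the search bound, and the threshold TH are illustrative values rather than values fixed by the application.

```python
import numpy as np

TH = 0.5  # classification threshold for the sigmoid output (illustrative)

def positive_rate(logits, th=TH):
    """PR = (TP + FP) / (TP + TN + FP + FN): fraction predicted as '1'."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return float(np.mean(probs >= th))

def accuracy(logits, labels, th=TH):
    probs = 1.0 / (1.0 + np.exp(-logits))
    return float(np.mean((probs >= th).astype(int) == labels))

def initial_optimize(f_a, f_b, y_a, y_b, step=0.01, A=0.01, w1_max=10.0):
    """Fix w2 = 0 and raise w1 from 0 until |PR_a - PR_b| <= A.

    f_a/f_b: model-front logits on the discriminated / normal training sets;
    with the simplified unit, effective logits are f_a + w1 and f_b + w2.
    """
    w1, w2 = 0.0, 0.0
    while w1 <= w1_max:
        if abs(positive_rate(f_a + w1) - positive_rate(f_b + w2)) <= A:
            break
        w1 += step  # first preset value increasing algorithm (arithmetic)
    acc = accuracy(np.concatenate([f_a + w1, f_b + w2]),
                   np.concatenate([y_a, y_b]))
    return w1, w2, acc  # w1 plays the role of w1m; acc initializes ACCmax
```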
Further, performing model loop optimization on the third model according to the training sample data set and the initial optimization unit parameter values to obtain an optimal optimization unit parameter value includes:
setting the value of the first weight parameter to 0, determining this value as the current loop value, and keeping it unchanged;
continuously reducing the value of the second weight parameter from 0 according to a preset value reduction algorithm until the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is within a second preset range, determining the value of the second weight parameter at that point as the second weight parameter value, and calculating the second accuracy rate of the third model on the training sample data set;
when the second accuracy rate is greater than the historical maximum accuracy rate, determining the second accuracy rate as the historical maximum accuracy rate;
tentatively setting the current loop value and the second weight parameter value as the optimal optimization unit parameter values;
increasing the value of the first weight parameter according to a second preset value increasing algorithm to obtain a new current loop value, and keeping it unchanged;
judging whether the current loop value is greater than the first weight parameter value;
if yes, determining the tentative optimal optimization unit parameter values as the optimal optimization unit parameter values;
if not, triggering and executing the step of continuously reducing the value of the second weight parameter from 0 according to the preset value reduction algorithm until the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is within the second preset range, determining the value of the second weight parameter at that point as the second weight parameter value, and calculating the second accuracy rate of the third model on the training sample data set.
In the implementation process, the method further selects the optimal first weight parameter and the optimal second weight parameter within the limited range of first weight parameter values, so that the optimization unit parameters are finalized and the optimized model overcomes the sample discrimination problem.
Further, the method further comprises:
when the second accuracy rate is not greater than the historical maximum accuracy rate, triggering and executing the step of increasing the value of the first weight parameter according to the second preset value increasing algorithm to obtain the current loop value, which is kept unchanged.
In the implementation process, the method performs a parameter space search (an orderly, exhaustive traversal) over the two parameters included in the optimal optimization unit parameter values, so that the best result found by the search is retained as the final result.
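Continuing the previous sketch (and reusing its positive_rate and accuracy helpers), the loop phase below searches the (w1, w2) grid just described; again, the step sizes, the tolerance B, and the lower bound on w2 are illustrative assumptions.

```python
import numpy as np

def loop_optimize(f_a, f_b, y_a, y_b, w1m, acc_max,
                  step=0.01, B=0.01, w2_min=-10.0):
    """Raise w1 from 0 to w1m; for each w1, lower w2 from 0 until the
    positive rates on the two populations agree, then keep the most
    accurate (w1, w2) pair seen so far."""
    y = np.concatenate([y_a, y_b])
    w_opt = (w1m, 0.0)       # candidate carried over from initial optimization
    w1 = 0.0                 # current loop value
    while w1 <= w1m:
        w2 = 0.0
        while w2 >= w2_min:  # preset value reduction algorithm (arithmetic)
            if abs(positive_rate(f_a + w1) - positive_rate(f_b + w2)) <= B:
                break
            w2 -= step
        acc = accuracy(np.concatenate([f_a + w1, f_b + w2]), y)
        if acc > acc_max:    # keep strictly better candidates only
            acc_max, w_opt = acc, (w1, w2)
        w1 += step           # second preset value increasing algorithm
    return w_opt, acc_max
```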
The second aspect of the present application provides a fairness-enhancing model optimization apparatus, comprising:
The acquisition unit is used for acquiring a training sample data set and a test sample data set, wherein the training sample data set comprises a discriminated training sample set and a normal training sample set, and the test sample data set comprises a discriminated test sample set and a normal test sample set;
The preliminary training unit is used for performing preliminary training of the two-class machine learning model according to the training sample data set to obtain a first model;
The structure modification unit is used for constructing an optimization unit in the first model to obtain a second model with a modified model structure;
The parameter setting unit is used for setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model;
The loop optimization unit is used for performing loop optimization on the third model according to the training sample data set to obtain the optimal optimization unit parameter value;
The parameter determining unit is used for setting the optimization unit parameters of the optimization unit in the third model according to the optimal optimization unit parameter values to obtain a fourth model;
the fine tuning optimization unit is used for carrying out fine tuning optimization on the fourth model according to the test sample data set to obtain the parameter value of the fine tuning optimization unit;
and the model generating unit is used for setting the optimization unit parameters of the optimization unit in the fourth model according to the fine tuning optimization unit parameter values to obtain the optimized target model.
Further, the training cut-off condition of the first model is that a preset cross entropy loss function converges or a training cycle exceeds a preset iteration threshold;
the first model comprises a model front part and a sigmoid unit;
in the second model, the optimization unit includes a first neuron and a second neuron at a first layer, and a third neuron at a second layer;
the inputs of the first neuron and the second neuron are population attribute labels of training samples in the training sample dataset;
the input of the front part of the model is a training sample in the training sample data set;
the output of the model front, the output of the first neuron and the output of the second neuron are respectively connected with the input of the third neuron;
the output of the third neuron is connected with the input of the sigmoid unit;
the output of the sigmoid unit is the output of the second model;
The optimization unit parameters of the optimization unit include a first weight parameter of the first neuron, a first bias parameter of the first neuron, a second weight parameter of the second neuron, a second bias parameter of the second neuron, a third weight parameter of the third neuron, and a third bias parameter of the third neuron.
Further, the output of the third neuron in the second model is:
o3 = (w31·w1 + w32·w2)·ci + w33·fi + w31·b1 + w32·b2 + b3;
wherein o3 is the output of the third neuron, w1 is the first weight parameter, b1 is the first bias parameter, w2 is the second weight parameter, b2 is the second bias parameter, w31, w32 and w33 are the third weight parameters, b3 is the third bias parameter, fi is the output of the model front for the training sample, and ci is the population attribute label of the training sample;
the preset optimizing unit parameter values are specifically as follows:
w31 = w33 = 1;
w32 = -1, b2 = b3 = 0, b1 = w2;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model, wherein the output of the third neuron in the third model is as follows:
o3 = (w1 - w2)·ci + fi + w2.
further, the loop optimization unit includes:
The initial optimization subunit is used for carrying out model initial optimization on the third model according to the training sample data set to obtain an initial optimization unit parameter value;
and the loop optimization subunit is used for carrying out model loop optimization on the third model according to the training sample data set and the initial optimization unit parameter values to obtain an optimal optimization unit parameter value.
Further, the initial optimization subunit includes:
the first setting module is used for setting the value of the second weight parameter to 0 and keeping the value unchanged;
The initial optimization module is used for continuously increasing the value of the first weight parameter from 0 according to a first preset value increasing algorithm until the difference between the first positive rate of the third model on the discriminated training sample set and the second positive rate of the third model on the normal training sample set is within a first preset range, determining the value of the first weight parameter at that point as the first weight parameter value, and calculating the first accuracy rate of the third model on the training sample data set;
the first determining module is used for determining the first accuracy as a historical maximum accuracy;
the first determining module is further configured to determine the first weight parameter value and the value of the second weight parameter as the initial optimization unit parameter values.
Further, the loop optimization subunit includes:
The second setting module is used for setting the value of the first weight parameter to 0, determining this value as the current loop value, and keeping it unchanged;
The loop optimization module is used for continuously reducing the value of the second weight parameter from 0 according to a preset value reduction algorithm until the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is within a second preset range, determining the value of the second weight parameter at that point as the second weight parameter value, and calculating the second accuracy rate of the third model on the training sample data set;
The second determining module is used for determining the second accuracy rate as the historical maximum accuracy rate when the second accuracy rate is larger than the historical maximum accuracy rate;
the tentative module is used for tentatively setting the current loop value and the second weight parameter value as the optimal optimization unit parameter values;
the calculation module is used for increasing the value of the first weight parameter according to a second preset value increasing algorithm to obtain the current loop value, which is kept unchanged;
the judging module is used for judging whether the current loop value is greater than the first weight parameter value;
the second determining module is further configured to determine, when the judgment result of the judging module is yes, the tentative optimal optimization unit parameter values as the optimal optimization unit parameter values;
and the second setting module is further configured to, when the judgment result of the judging module is no, trigger the loop optimization module to continue reducing the value of the second weight parameter from 0 according to the preset value reduction algorithm until the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is within the second preset range, determine the value of the second weight parameter at that point as the second weight parameter value, and calculate the second accuracy rate of the third model on the training sample data set.
Further, the judging module is further configured to, when the second accuracy rate is not greater than the historical maximum accuracy rate, trigger the calculation module to increase the value of the first weight parameter according to the second preset value increasing algorithm to obtain the current loop value, which is kept unchanged.
The third aspect of the application provides a training method of a job hunting resume screening model, which comprises the following steps:
obtaining a training sample data set and a test sample data set, wherein the training sample data set comprises a first female recruiter resume set and a first male recruiter resume set;
performing preliminary training of a two-class machine learning model according to the training sample data set to obtain a first model;
constructing an optimization unit in the first model to obtain a second model with a modified model structure;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model;
performing loop optimization on the third model according to the training sample data set to obtain an optimal optimization unit parameter value;
setting the optimization unit parameters of the optimization unit in the third model according to the optimal optimization unit parameter values to obtain a fourth model;
performing fine tuning optimization on the fourth model according to the test sample data set to obtain fine tuning optimization unit parameter values;
and setting the optimization unit parameters of the optimization unit in the fourth model according to the fine tuning optimization unit parameter values to obtain the optimized job hunting resume screening model.
The fourth aspect of the application provides a job hunting resume screening method, which comprises the following steps:
acquiring a recruiter's resume;
inputting the recruiter's resume into a job hunting resume screening model, and outputting a screening result indicating whether the recruiter's resume meets the recruitment requirements, wherein the job hunting resume screening model is obtained by the model optimization method for enhancing fairness according to any one of the first aspect of the present application.
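For illustration only, a call into such a screening model might look like the following sketch; screen_resume, encode_resume, and target_model are hypothetical names, and the threshold corresponds to the sigmoid classification threshold TH described in the embodiments below.

```python
# Hypothetical usage sketch of the optimized screening model.
def screen_resume(resume, target_model, encode_resume, th=0.5):
    x = encode_resume(resume)   # recruiter's resume as a feature vector
    y = target_model(x)         # sigmoid output of the optimized model
    return y >= th              # True: the resume meets the requirements
```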
A fifth aspect of the present application provides an electronic device comprising a memory for storing a computer program and a processor for running the computer program to cause the electronic device to perform the fairness enhancing model optimization method of any one of the first aspects of the present application.
A sixth aspect of the application provides a computer readable storage medium storing computer program instructions which, when read and executed by a processor, perform the fairness-enhancing model optimization method of any one of the first aspects of the application.
A seventh aspect of the application provides a computer program product comprising a computer program which, when run by a processor, performs the fairness enhancing model optimization method of any one of the first aspects of the application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to fig. 1, fig. 1 is a flow chart of a model optimization method for enhancing fairness according to the present embodiment. The model optimization method for enhancing fairness comprises the following steps:
S101, acquiring a training sample data set and a test sample data set, wherein the training sample data set comprises a discriminated training sample set and a normal training sample set, and the test sample data set comprises a discriminated test sample set and a normal test sample set.
S102, performing preliminary training of the two-class machine learning model according to the training sample data set to obtain a first model.
In this embodiment, the training cut-off condition of the first model is that the preset cross entropy loss function converges or the training round exceeds a preset iteration threshold;
the first model includes a model front and a sigmoid unit.
S103, constructing an optimization unit in the first model to obtain a second model with a modified model structure.
In this embodiment, in the second model, the optimizing unit includes a first neuron and a second neuron at a first layer, and a third neuron at a second layer;
The inputs of the first neuron and the second neuron are population attribute labels of training samples in the training sample data set;
the input of the front part of the model is a training sample in a training sample data set;
the output of the front part of the model, the output of the first neuron and the output of the second neuron are respectively connected with the input of the third neuron;
the output of the third neuron is connected with the input of the sigmoid unit;
The output of the sigmoid unit is the output of the second model;
The optimization unit parameters of the optimization unit include a first weight parameter of the first neuron, a first bias parameter of the first neuron, a second weight parameter of the second neuron, a second bias parameter of the second neuron, a third weight parameter of the third neuron, and a third bias parameter of the third neuron.
In this embodiment, the output of the third neuron in the second model is:
o3 = (w31·w1 + w32·w2)·ci + w33·fi + w31·b1 + w32·b2 + b3;
wherein o3 is the output of the third neuron, w1 is the first weight parameter, b1 is the first bias parameter, w2 is the second weight parameter, b2 is the second bias parameter, w31, w32 and w33 are the third weight parameters, b3 is the third bias parameter, fi is the output of the front part of the model for the training sample, and ci is the population attribute label of the training sample;
the preset optimization unit parameter values are specifically as follows:
w31 = w33 = 1;
w32 = -1, b2 = b3 = 0, b1 = w2;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values, and obtaining a third model, wherein the output of a third neuron in the third model is as follows:
o3 = (w1 - w2)·ci + fi + w2.
S104, setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values, and obtaining a third model.
S105, performing loop optimization on the third model according to the training sample data set to obtain the optimal optimization unit parameter value.
S106, setting the optimization unit parameters of the optimization unit in the third model according to the optimal optimization unit parameter values to obtain a fourth model.
S107, performing fine tuning optimization on the fourth model according to the test sample data set to obtain the fine tuning optimization unit parameter values.
S108, setting the optimization unit parameters of the optimization unit in the fourth model according to the fine adjustment optimization unit parameter values to obtain the optimized target model.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
Therefore, by implementing the model optimization method for enhancing fairness described in this embodiment, after model training is completed, an optimization unit can be added to the model to enhance fairness of the model.
Example 2
Referring to fig. 2, fig. 2 is a flow chart of a model optimization method for enhancing fairness according to the present embodiment. The model optimization method for enhancing fairness comprises the following steps:
S201, a training sample data set and a test sample data set are obtained, wherein the training sample data set comprises a discriminated training sample set and a normal training sample set, and the test sample data set comprises a discriminated test sample set and a normal test sample set.
In this embodiment, the training sample data set is Tr and the test sample data set is Te.
In this embodiment, the discriminated population is Ga and the normal population is Gb.
In this embodiment, the discriminated training sample set is Tra, and the normal training sample set is Trb.
In this embodiment, the training sample data set of the machine learning model is Tr, which is composed of two parts, i.e., the sample set Tra of the discriminated population Ga and the sample set Trb of the normal population Gb.
In this embodiment, the discriminated test sample set is Tea and the normal test sample set is Teb.
In this embodiment, the test sample data set of the machine learning model is Te, which is composed of two parts, i.e., the sample set Tea of the discriminated population Ga and the sample set Teb of the normal population Gb.
In this embodiment, the machine learning model predicts each sample in Tr as class "1" or "0". The population Ga being discriminated against means that the probability of a sample in population Ga being predicted as class "1" by the model is lower than the probability of a sample in population Gb being predicted as class "1". It will be appreciated that a sample population with a relatively low positive rate is considered to be discriminated against.
S202, performing preliminary training of a two-class machine learning model according to a training sample data set to obtain a first model.
In this embodiment, the method uses Tr as the training sample data set to train the two-class machine learning model M (i.e., the first model). The last layer of the model M adopts a sigmoid unit, and the remaining part of the model M after the final sigmoid unit is removed is denoted as the front part of the model M (namely, the model front).
In this embodiment, L is the cross entropy loss function of model M on training sample dataset Tr.
In this embodiment, the method ends the model preliminary training when L converges or the training round exceeds a threshold Rth, where Rth is a preset positive integer constant.
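A minimal sketch of this cut-off rule follows; train_one_round is a hypothetical helper that performs one training round over Tr and returns the current cross entropy loss L, and eps is an illustrative convergence tolerance (the application does not fix a specific convergence test).

```python
# Sketch of the preliminary-training cut-off: stop on convergence of L or
# once the training round exceeds the threshold Rth.
def preliminary_training(model, Tr, train_one_round, Rth=1000, eps=1e-6):
    prev_loss = float("inf")
    for r in range(1, Rth + 1):
        loss = train_one_round(model, Tr)   # one round over Tr, returns L
        if abs(prev_loss - loss) < eps:     # treat L as converged
            break
        prev_loss = loss
    return model                            # the first model M
```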
S203, constructing an optimization unit in the first model to obtain a second model with a modified model structure.
Referring to fig. 3, fig. 3 shows a schematic structural diagram of the second model. The optimization unit comprises three neurons arranged in two layers: the first layer comprises two neurons n1 and n2, and the second layer comprises one neuron n3. As can be seen from fig. 3, the inputs of neurons n1 and n2 are both ci, the population attribute label of the i-th training sample xi, where ci is 1 or 0: ci = 0 indicates that the training sample xi belongs to the normal population Gb, and ci = 1 indicates that the training sample xi belongs to the discriminated population Ga. The outputs of neurons n1 and n2 and the output of the front part of the model M are the inputs of neuron n3, the output of neuron n3 is the input of the sigmoid unit, and the output of the sigmoid unit is yi.
In this embodiment, the second model with the added optimization unit is denoted as M', where 1 ≤ i ≤ N, i and N are positive integers, and N is the total number of training samples.
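Under the structure of fig. 3, one forward pass of M' can be sketched as follows; the model front is abstracted as a function returning the scalar fi, and all names are illustrative rather than taken from the application.

```python
import numpy as np

def second_model_forward(x_i, c_i, model_front, params):
    """params = (w1, b1, w2, b2, w31, w32, w33, b3); returns y_i in (0, 1)."""
    w1, b1, w2, b2, w31, w32, w33, b3 = params
    f_i = model_front(x_i)                       # output of the model front
    o1 = w1 * c_i + b1                           # first-layer neuron n1
    o2 = w2 * c_i + b2                           # first-layer neuron n2
    o3 = w31 * o1 + w32 * o2 + w33 * f_i + b3    # second-layer neuron n3
    return 1.0 / (1.0 + np.exp(-o3))             # sigmoid unit -> y_i
```

Expanding o3 here reproduces the formula o3 = (w31·w1 + w32·w2)·ci + w33·fi + w31·b1 + w32·b2 + b3 given above.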
S204, setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model.
S205, setting the value of the second weight parameter to 0, and keeping the value unchanged.
In this embodiment, step S205 is performed, and the round r of optimization training is initialized to 1.
S206, continuously increasing the value of the first weight parameter from 0 according to a first preset value increasing algorithm until the difference between the first positive rate of the third model on the discriminated training sample set and the second positive rate of the third model on the normal training sample set is within a first preset range, determining the value of the first weight parameter at that point as the first weight parameter value, and calculating the first accuracy rate of the third model on the training sample data set.
In this embodiment, the difference between the first positive rate of the third model on the discriminated training sample set and the second positive rate of the third model on the normal training sample set may be positive or negative, and the first preset range is [-A, A], where A is a first preset non-negative real number.
In this embodiment, the execution condition of this step may also be "when the absolute value of the difference between the first positive rate of the third model on the discriminated training sample set and the second positive rate of the third model on the normal training sample set is smaller than A", where A is the first preset non-negative real number.
In this embodiment, on the training sample data set, the number of samples with class label "1" predicted by the machine learning model as class "1" is denoted TP (True Positive);
the number of samples with class label "1" predicted as class "0" is denoted FN (False Negative);
the number of samples with class label "0" predicted as class "1" is denoted FP (False Positive);
and the number of samples with class label "0" predicted as class "0" is denoted TN (True Negative).
It can be seen that label "1" predicted as "1" is TP, label "1" predicted as "0" is FN, label "0" predicted as "1" is FP, and label "0" predicted as "0" is TN.
In this embodiment, the classification threshold for the sigmoid unit's output, determined by testing on the test sample set Te, is denoted TH ∈ (0, 1); that is, if the model output is greater than or equal to TH, the predicted class is "1", otherwise the predicted class is "0".
In this embodiment, the positive rate of the model on the training sample set is defined as:
PR = (TP + FP) / (TP + TN + FP + FN);
wherein the positive rate may also be referred to as the prediction rate of label "1";
on the other hand, the accuracy is:
ACC = (TP + TN) / (TP + TN + FP + FN).
Based on this, the positive rates of the model on the sample sets Tra and Trb are denoted PRra and PRrb, respectively.
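The definitions above transcribe directly into code; the following sketch is illustrative (the outputs, labels, and threshold are made-up numbers, not data from the application).

```python
import numpy as np

def confusion_counts(outputs, labels, th):
    """Counts TP/FN/FP/TN from sigmoid outputs using threshold th."""
    pred = (outputs >= th).astype(int)
    tp = int(np.sum((labels == 1) & (pred == 1)))
    fn = int(np.sum((labels == 1) & (pred == 0)))
    fp = int(np.sum((labels == 0) & (pred == 1)))
    tn = int(np.sum((labels == 0) & (pred == 0)))
    return tp, fn, fp, tn

outputs = np.array([0.9, 0.4, 0.2, 0.1])   # sigmoid outputs (illustrative)
labels = np.array([1, 1, 0, 0])            # class labels (illustrative)
tp, fn, fp, tn = confusion_counts(outputs, labels, th=0.5)
PR = (tp + fp) / (tp + tn + fp + fn)       # positive rate: 0.25 here
ACC = (tp + tn) / (tp + tn + fp + fn)      # accuracy rate: 0.75 here
```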
In this embodiment, the first preset value increasing algorithm may be a step algorithm (arithmetic progression) or an adaptive algorithm (an algorithm that adaptively adjusts the increment based on the result); the same applies to the second preset value increasing algorithm and to the value reduction algorithm.
S207, determining the first accuracy as the historical maximum accuracy.
S208, determining the value of the first weight parameter and the value of the second weight parameter as the value of the initial optimization unit parameter.
For example, the "model initial optimization" step of steps S205 to S208 includes:
when r = 1, keeping w2 = 0 unchanged, continuously increasing the value of w1 (i.e., the value of the first weight parameter) from 0, and calculating the positive rates PRra and PRrb of the model M' on the training sample sets Tra and Trb until PRra ≈ PRrb; recording the value of w1 at this point as w1m (i.e., the first weight parameter value); calculating the accuracy rate ACCr of the model M' on the training sample set Tr; letting the historical maximum accuracy ACCmax = ACCr, r = r + 1, and the optimal parameters wopt = [w1, w2]; and then letting w1 = 0.
In this embodiment, "the value of the first weight parameter" refers to a first weight parameter that has not yet been determined, while "the first weight parameter value" refers to the determined first weight parameter. The same convention applies to the second weight parameter.
S209, setting the value of the first weight parameter to 0, determining this value as the current loop value, and keeping it unchanged.
In this embodiment, when step S209 is performed, the round r of the optimization training is changed to 2.
S210, continuously reducing the value of the second weight parameter from 0 according to a preset value reduction algorithm until the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is within a second preset range, determining the value of the second weight parameter at that point as the second weight parameter value, and calculating the second accuracy rate of the third model on the training sample data set.
In this embodiment, the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set may be positive or negative, and the second preset range is [-B, B], where B is a second preset non-negative real number.
In this embodiment, the execution condition of this step may also be "when the absolute value of the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is smaller than B", where B is the second preset non-negative real number.
In this embodiment, when step S210 is performed for the first time, the round r of optimization training is 2, and on each subsequent loop iteration, r = r + 1.
S211, when the second accuracy rate is larger than the historical maximum accuracy rate, determining the second accuracy rate as the historical maximum accuracy rate.
As an alternative embodiment, the method further comprises:
when the second accuracy rate is not greater than the historical maximum accuracy rate, triggering and executing the step of increasing the value of the first weight parameter according to the second preset value increasing algorithm to obtain the current loop value, which is kept unchanged.
In this embodiment, the step triggered in the above description is step S213.
S212, tentatively setting the current loop value and the second weight parameter value as the optimal optimization unit parameter values.
S213, increasing the value of the first weight parameter according to a second preset value increasing algorithm to obtain the current loop value, and keeping it unchanged.
S214, judging whether the current loop value is greater than the first weight parameter value; if yes, executing step S215, and if not, executing step S210.
S215, determining the tentative optimal optimization unit parameter value as the optimal optimization unit parameter value.
For example, the "model loop optimization" step described in steps S209 to S215 includes: if w1 > w1m, terminating the loop execution of the "model loop optimization" step; if w1 ≤ w1m, keeping the value of w1 unchanged, continuously reducing the value of w2 from 0, and calculating the positive rates PRra and PRrb of the model M' on the training sample sets Tra and Trb until PRra ≈ PRrb, then calculating the accuracy rate ACCr of the model M' on the training sample set Tr at this point. If ACCr > ACCmax, letting ACCmax = ACCr, r = r + 1, and the optimal parameters wopt = [w1, w2], increasing the value of w1 appropriately, and executing the "model loop optimization" step again; if ACCr ≤ ACCmax, letting r = r + 1, increasing the value of w1 appropriately, and executing the "model loop optimization" step again.
S216, setting the optimization unit parameters of the optimization unit in the third model according to the optimal optimization unit parameter values to obtain a fourth model.
S217, performing fine tuning optimization on the fourth model according to the test sample data set to obtain fine tuning optimization unit parameter values.
In this embodiment, the specific step of updating the final model described in step S217 includes assigning the values of w1 and w2 stored in the optimal parameters wopt to the corresponding weight parameters of the optimization unit, and then fine-tuning the values of w1 and w2 on the test sample set Te by reference to the "model initial optimization" and "model loop optimization" steps; the fine-tuned values of w1 and w2 are the final values of the corresponding weight parameters of the optimization unit.
S218, setting the optimization unit parameters of the optimization unit in the fourth model according to the fine adjustment optimization unit parameter values, and obtaining the optimized target model.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
Therefore, by implementing the model optimization method for enhancing fairness described in this embodiment, after model training is completed, an optimization unit can be added to the model to enhance fairness of the model.
Example 3
Embodiment 3 provides an example flow of the model optimization method for enhancing fairness applied to a job hunting resume intelligent screening model. The process comprises the following steps:
In the intelligent screening scenario of job hunting resumes, a job hunting resume contains user attribute information data of the recruiter (including the position applied for, the recruiter category, work/internship experience, personal skills, graduating university, major, degree, gender, and age).
S301, digitizing all user attribute information data contained in each recruiter's resume, and constructing a 9-dimensional vector from the 9 digitized values, wherein each 9-dimensional vector is one piece of sample data.
In this embodiment, in this step, four attribute items may be digitized according to one-hot codes: the position applied for (such as a development, sales, pre-sales, management, administrative, research, financial, or logistics position), the recruiter category (fresh graduate or working professional), the major (within or outside the professional range required by the position), and gender (male or female); the remaining five attribute items may be digitized as non-negative integers: work/internship experience (total number of internship and work months), personal skills (number of skills meeting the position requirements), graduating university (domestic 985 university or foreign university in the QS top 100; domestic non-985 university or foreign university in the QS top 300; ordinary domestic university or foreign university outside the QS top 300), degree (doctorate, master's, bachelor's, or associate), and age.
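An illustrative encoding is sketched below, assuming integer category codes for the categorical attributes so that the result stays 9-dimensional (the one-hot variant mentioned above would expand the dimensionality instead); all category lists and field names are hypothetical.

```python
# Hypothetical encoding of one resume into a 9-dimensional sample vector.
POSITIONS = ["development", "sales", "pre-sales", "management",
             "administrative", "research", "financial", "logistics"]
CATEGORIES = ["fresh graduate", "working professional"]
MAJOR_FIT = ["outside required range", "within required range"]
GENDERS = ["male", "female"]            # index 1 = discriminated population
DEGREES = ["associate", "bachelor", "master", "doctorate"]
SCHOOL_TIERS = ["ordinary / outside QS top 300",
                "non-985 / QS top 300", "985 / QS top 100"]

def encode_resume(r: dict) -> list:
    return [
        POSITIONS.index(r["position"]),
        CATEGORIES.index(r["category"]),
        r["experience_months"],          # work/internship experience
        r["matching_skills"],            # skills meeting the requirements
        SCHOOL_TIERS.index(r["school_tier"]),
        MAJOR_FIT.index(r["major_fit"]),
        DEGREES.index(r["degree"]),
        GENDERS.index(r["gender"]),      # also the population label c_i
        r["age"],
    ]
```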
S302, forming a sample data set for model training from the pieces of sample data corresponding to a plurality of recruiter resumes, wherein the discriminated population Ga is the female recruiters, the normal population Gb is the male recruiters, and Tr is composed of the sample data set Tra of the discriminated population Ga and the sample data set Trb of the normal population Gb.
In this embodiment, the method is used to eliminate the discrimination described above.
In this embodiment, the machine learning model predicts the sample data in Tr as a category of "1" or "0", where "1" indicates that the recruiter resume corresponding to the sample data meets the recruitment requirement, and "0" indicates that the recruiter resume corresponding to the sample data does not meet the recruitment requirement.
In this embodiment, one of the training requirements of the job hunting resume intelligent screening model is that the result output by the model cannot discriminate by gender, i.e., males and females enjoy an equal opportunity to be hired. Whether males and females enjoy an equal opportunity can be specifically determined by checking whether the model's positive rates PRea and PReb on the test sample sets Tea and Teb are equal.
S303, marking the round of model optimization training as r, and initializing r to 1. Training a two-class machine learning model M (such as a ResNet deep neural network) with Tr as training data, wherein the last layer of the model M adopts a sigmoid unit, the remaining part of the model M after the final sigmoid unit is removed is denoted as the front part of the model M, and L is the cross entropy loss function of the model M on the data set Tr. Model preliminary training ends when L converges or the training round exceeds the threshold Rth = 1000.
S304, constructing an optimization unit by using three neurons.
In this embodiment, the optimization unit is divided into two layers, the first layer containing 2 neurons n1 and n2, and the second layer containing 1 neuron n3. The model with the addition of the optimization unit is denoted as M'. The output of neuron n3 is:
o3 = (w31·w1 + w32·w2)·ci + w33·fi + w31·b1 + w32·b2 + b3;
wherein w31 = w33 = 1;
w32 = -1; b2 = b3 = 0; b1 = w2.
S305, when r = 1, keeping w2 = 0 unchanged, continuously increasing the value of w1 from 0, and calculating the positive rates PRra and PRrb of the model M' on the training sample sets Tra and Trb until PRra ≈ PRrb; recording the value of w1 at this point as w1m; calculating the accuracy rate ACCr of the model M' on the training sample set Tr; letting the historical maximum accuracy ACCmax = ACCr, r = r + 1, and the optimal parameters wopt = [w1, w2]; and letting w1 = 0.
S306, judging whether w1 is greater than w1m: if w1 > w1m, terminating the loop execution of step S306; if w1 ≤ w1m, keeping the value of w1 unchanged, continuously reducing the value of w2 from 0, and calculating the positive rates PRra and PRrb of the model M' on the training sample sets Tra and Trb until PRra ≈ PRrb, then calculating the accuracy rate ACCr of the model M' on the training sample set Tr at this point.
At this time, if ACCr > ACCmax, letting ACCmax = ACCr, r = r + 1, and the optimal parameters wopt = [w1, w2], increasing w1, and executing step S306 in a loop;
if ACCr ≤ ACCmax, letting r = r + 1, increasing w1, and executing step S306 in a loop.
S307, assigning the values of w1 and w2 stored in the optimal parameters wopt to the corresponding weight parameters of the optimization unit, and then, by reference to steps S305 and S306, fine-tuning the values of w1 and w2 on the test sample set Te; the fine-tuned values of w1 and w2 are the final values of the corresponding weight parameters of the optimization unit.
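In code, this fine-tuning pass amounts to rerunning the earlier two-phase search on the test split; the sketch below reuses the hypothetical initial_optimize and loop_optimize helpers from the sketches above, with fe_a/fe_b and ye_a/ye_b standing for model-front logits and labels on Tea and Teb (all names are assumptions).

```python
# Rerun the two-phase search on the test sample set Te to fine-tune (w1, w2).
w1m_te, _, acc_te = initial_optimize(fe_a, fe_b, ye_a, ye_b)
(w1_final, w2_final), _ = loop_optimize(fe_a, fe_b, ye_a, ye_b,
                                        w1m_te, acc_te)
# w1_final and w2_final become the optimization unit's final weight values.
```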
As an alternative implementation, the method may include, when training the job hunting resume screening model:
obtaining a training sample data set and a test sample data set, wherein the training sample data set comprises a first female recruiter resume set and a first male recruiter resume set;
performing preliminary training of a two-class machine learning model according to the training sample data set to obtain a first model;
constructing an optimization unit in the first model to obtain a second model with a modified model structure;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values to obtain a third model;
performing loop optimization on the third model according to the training sample data set to obtain an optimal optimization unit parameter value;
setting the optimization unit parameters of the optimization unit in the third model according to the optimal optimization unit parameter values to obtain a fourth model;
performing fine tuning optimization on the fourth model according to the test sample data set to obtain fine tuning optimization unit parameter values;
and setting the optimization unit parameters of the optimization unit in the fourth model according to the fine tuning optimization unit parameter values to obtain the optimized job hunting resume screening model.
As a further alternative embodiment, the method further comprises:
Acquiring a resume of an recruiter;
inputting the recruiter's resume into the job hunting resume screening model, and outputting a screening result indicating whether the recruiter's resume meets the recruitment requirements.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
Therefore, by implementing the model optimization method for enhancing fairness described in this embodiment, an optimization unit can be added to the job hunting resume intelligent screening model to improve the fairness of the model, thereby eliminating the existing discrimination.
Example 4
Referring to fig. 4, fig. 4 is a schematic structural diagram of a model optimizing apparatus for enhancing fairness according to the present embodiment. As shown in fig. 4, the fairness-enhancing model optimizing apparatus includes:
An acquisition unit 410 for acquiring a training sample data set and a test sample data set, wherein the training sample data set comprises a discriminated training sample set and a normal training sample set;
The preliminary training unit 420 is configured to perform preliminary training of the two-class machine learning model according to the training sample data set to obtain a first model;
a structure modifying unit 430, configured to construct an optimizing unit in the first model, to obtain a second model with a modified model structure;
a parameter setting unit 440, configured to set the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values, so as to obtain a third model;
The loop optimization unit 450 is configured to perform loop optimization on the third model according to the training sample data set to obtain the optimal optimization unit parameter value;
the parameter determining unit 460 is configured to set an optimization unit parameter of the optimization unit in the third model according to the optimal optimization unit parameter value, so as to obtain a fourth model;
The fine tuning optimization unit 470 is configured to perform fine tuning optimization on the fourth model according to the test sample data set, so as to obtain a parameter value of the fine tuning optimization unit;
the model generating unit 480 is configured to set the optimization unit parameters of the optimization unit in the fourth model according to the fine tuning optimization unit parameter values, so as to obtain the optimized target model.
In this embodiment, the explanation of the model optimizing apparatus for enhancing fairness may refer to the description in embodiment 1 or embodiment 2, and the description is not repeated in this embodiment.
It can be seen that implementing the model optimizing device for enhancing fairness described in this embodiment can improve fairness of a model by adding an optimizing unit to the model after model training is completed.
Example 5
Referring to fig. 5, fig. 5 is a schematic structural diagram of a model optimizing apparatus for enhancing fairness according to the present embodiment. As shown in fig. 5, the fairness-enhancing model optimizing apparatus includes:
An acquisition unit 410 for acquiring a training sample data set and a test sample data set, wherein the training sample data set comprises a discriminated training sample set and a normal training sample set;
The preliminary training unit 420 is configured to perform preliminary training of the two-class machine learning model according to the training sample data set to obtain a first model;
a structure modifying unit 430, configured to construct an optimizing unit in the first model, to obtain a second model with a modified model structure;
a parameter setting unit 440, configured to set the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values, so as to obtain a third model;
The loop optimization unit 450 is configured to perform loop optimization on the third model according to the training sample data set to obtain the optimal optimization unit parameter value;
the parameter determining unit 460 is configured to set an optimization unit parameter of the optimization unit in the third model according to the optimal optimization unit parameter value, so as to obtain a fourth model;
The fine tuning optimization unit 470 is configured to perform fine tuning optimization on the fourth model according to the test sample data set, so as to obtain a parameter value of the fine tuning optimization unit;
the model generating unit 480 is configured to set the optimization unit parameters of the optimization unit in the fourth model according to the fine tuning optimization unit parameter values, so as to obtain the optimized target model.
In this embodiment, the training cut-off condition of the first model is that the preset cross entropy loss function converges or the training round exceeds a preset iteration threshold;
The first model comprises a model front part and a sigmoid unit;
In the second model, the optimization unit includes a first neuron and a second neuron at a first layer, and a third neuron at a second layer;
The inputs of the first neuron and the second neuron are population attribute labels of training samples in the training sample data set;
the input of the front part of the model is a training sample in a training sample data set;
the output of the front part of the model, the output of the first neuron and the output of the second neuron are respectively connected with the input of the third neuron;
the output of the third neuron is connected with the input of the sigmoid unit;
The output of the sigmoid unit is the output of the second model;
The optimization unit parameters of the optimization unit include a first weight parameter of the first neuron, a first bias parameter of the first neuron, a second weight parameter of the second neuron, a second bias parameter of the second neuron, a third weight parameter of the third neuron, and a third bias parameter of the third neuron.
In this embodiment, the output of the third neuron in the second model is:
o3 = (w31·w1 + w32·w2)·ci + w33·fi + w31·b1 + w32·b2 + b3;
wherein o3 is the output of the third neuron, w1 is the first weight parameter, b1 is the first bias parameter, w2 is the second weight parameter, b2 is the second bias parameter, w31, w32 and w33 are the third weight parameters, b3 is the third bias parameter, fi is the output of the front part of the model for the training sample, and ci is the population attribute label of the training sample;
the preset optimization unit parameter values are specifically as follows:
w31 = w33 = 1;
w32 = -1, b2 = b3 = 0, b1 = w2;
setting the optimization unit parameters of the optimization unit in the second model according to the preset optimization unit parameter values, and obtaining a third model, wherein the output of a third neuron in the third model is as follows:
o3 = (w1 - w2)·ci + fi + w2.
As an alternative embodiment, the loop optimization unit 450 includes:
An initial optimization subunit 451, configured to perform model initial optimization on the third model according to the training sample data set, so as to obtain an initial optimization unit parameter value;
and the loop optimization subunit 452 is configured to perform model loop optimization on the third model according to the training sample data set and the initial optimization unit parameter values, so as to obtain an optimal optimization unit parameter value.
As an alternative embodiment, the initial optimization subunit 451 includes:
the first setting module is used for setting the value of the second weight parameter to 0 and keeping the value unchanged;
The initial optimization module is used for continuously increasing the value of the first weight parameter from 0 according to a first preset value increasing algorithm until the difference between the first positive rate of the third model on the discriminated training sample set and the second positive rate of the third model on the normal training sample set is within a first preset range, determining the value of the first weight parameter at that point as the first weight parameter value, and calculating the first accuracy rate of the third model on the training sample data set;
the first determining module is used for determining the first accuracy as the historical maximum accuracy;
the first determining module is further configured to determine the first weight parameter value and the value of the second weight parameter as the initial optimization unit parameter values.
As an alternative embodiment, the loop optimization subunit 452 includes:
The second setting module is used for setting the value of the first weight parameter to 0, determining this value as the current loop value, and keeping it unchanged;
the loop optimization module is used for continuously reducing the value of the second weight parameter from 0 according to a preset value reduction algorithm until the difference between the third positive rate of the third model on the discriminated training sample set and the fourth positive rate of the third model on the normal training sample set is within a second preset range, determining the value of the second weight parameter at that point as the second weight parameter value, and calculating the second accuracy rate of the third model on the training sample data set;
The second determining module is used for determining the second accuracy rate as the historical maximum accuracy rate when the second accuracy rate is larger than the historical maximum accuracy rate;
The tentative module is used for tentatively setting the current cycle value and the second weight parameter value as the optimal optimization unit parameter value;
The calculation module is used for increasing the value of the first weight parameter according to a second preset value increasing algorithm to obtain a current circulation value, and the current circulation value is kept unchanged;
the judging module is used for judging whether the current circulation value is larger than the first weight parameter value or not;
The second determining module is further configured to determine, when the determination result of the determining module is yes, the tentative optimal optimization unit parameter value as the optimal optimization unit parameter value;
And the second setting module is further configured to, when the judging result of the judging module is no, trigger the cycle optimization module to execute to continuously decrease the value of the second weight parameter from 0 according to the preset value decrease algorithm until the difference between the third Cha Yang rate of the third model on the discriminated training sample set and the fourth Cha Yang rate of the third model on the normal training sample set is within a second preset range, determine the value of the second weight parameter as the value of the second weight parameter, and calculate the second accuracy of the third model on the training sample data set.
As an optional implementation manner, the judging module is further configured to trigger the calculating module to execute the above-mentioned increasing the value of the first weight parameter according to the second preset value increasing algorithm when the second accuracy is not greater than the historical maximum accuracy, so as to obtain the current cyclic value, and keep unchanged.
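A corresponding sketch of the cyclic optimization subunit 452, under the same illustrative assumptions as above: positive_rate() and accuracy() are placeholder helpers, w1_star is the first weight parameter value returned by the initial optimization, and the step sizes, the tolerance defining the second preset range, and the search floor min_w2 are assumptions:

```python
def cyclic_optimization(model, discriminated_set, normal_set, train_set,
                        w1_star, history_max_accuracy,
                        outer_step=0.01, inner_step=0.01,
                        tolerance=0.01, min_w2=-10.0):
    """Subunit 452: sweep the current loop value for w1, searching w2 each time."""
    best_params = None
    current = 0.0                  # second setting module: current loop value = 0
    while current <= w1_star:      # judging module: stop once current > w1_star
        model.w1 = current
        w2 = 0.0
        while True:                # preset value-decreasing algorithm (assumed linear)
            model.w2 = w2
            gap = abs(positive_rate(model, discriminated_set)
                      - positive_rate(model, normal_set))
            if gap <= tolerance or w2 <= min_w2:   # within the second preset range
                break
            w2 -= inner_step
        second_accuracy = accuracy(model, train_set)
        if second_accuracy > history_max_accuracy:   # second determining module
            history_max_accuracy = second_accuracy
            best_params = (current, w2)              # tentative module
        current += outer_step      # calculation module: next current loop value
    return best_params             # tentatively set optimal parameter value
```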
In this embodiment, for further explanation of the model optimization apparatus for enhancing fairness, reference may be made to the descriptions in embodiment 1 or embodiment 2, which are not repeated here.
It can be seen that the model optimization apparatus for enhancing fairness described in this embodiment can improve the fairness of a model by adding an optimization unit to the model after model training is completed.
An embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program and the processor is configured to execute the computer program to cause the electronic device to perform the model optimization method for enhancing fairness of embodiment 1 or embodiment 2 of the present application.
An embodiment of the present application provides a computer-readable storage medium storing computer program instructions that, when read and executed by a processor, perform the model optimization method for enhancing fairness of embodiment 1 or embodiment 2 of the present application.
An embodiment of the present application provides a computer program product comprising a computer program which, when executed by a processor, performs the model optimization method for enhancing fairness of embodiment 1 or embodiment 2 of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatuses, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code comprising one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in a block may occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, or any other medium capable of storing program code.
The above description is only an example of the present application and is not intended to limit its scope; various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall be included within its protection scope. It should be noted that like reference numerals and letters refer to like items in the figures; once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
The foregoing is merely illustrative of the present application and does not limit it; any person skilled in the art will readily recognize that variations or substitutions fall within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," and any other variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.