Summary of the invention
One or more embodiments of this specification aim to provide a label-propagation-based information prediction method and apparatus, so that information prediction by label propagation is both efficient and accurate.
To solve the above technical problem, one or more embodiments of this specification are implemented as follows:
In one aspect, one or more embodiments of this specification provide a label-propagation-based information prediction method, comprising:
obtaining first labels of a specified type for a plurality of sample users, and obtaining behavior information of each sample user and of a user to be predicted;
determining an initial label of the user to be predicted according to the first labels and the behavior information;
determining a dynamic label propagation algorithm corresponding to each iteration, and iterating on the initial label according to the first labels using the dynamic label propagation algorithm;
when the iteration satisfies a preset convergence condition, determining the label obtained in the last iteration as a second label of the specified type for the user to be predicted.
In one embodiment, determining the initial label of the user to be predicted according to the first labels and the behavior information comprises:
determining, according to the behavior information, a behavior similarity between each sample user and the user to be predicted;
selecting, from the sample users, a first sample user having the highest behavior similarity to the user to be predicted;
determining the first label of the first sample user as the initial label of the user to be predicted.
In one embodiment, determining the dynamic label propagation algorithm corresponding to each iteration, and iterating on the initial label according to the first labels using the dynamic label propagation algorithm, comprises:
iterating the following steps until the label matrix obtained after an iteration satisfies the preset convergence condition:
determining a first label matrix and a first label propagation matrix corresponding to the current iteration, wherein the label matrix of the initial iteration is constructed from the first labels and the initial label;
computing the product of the first label propagation matrix and the first label matrix, and determining the product as a to-be-processed label matrix corresponding to the next iteration;
normalizing the to-be-processed label matrix, and updating at least one matrix value in the normalized to-be-processed label matrix according to the first labels and a preset confidence threshold, to obtain a second label matrix corresponding to the next iteration;
updating each matrix value in the first label propagation matrix according to the second label matrix, to obtain a second label propagation matrix corresponding to the next iteration;
determining, according to the second label propagation matrix and the second label matrix, a third label matrix corresponding to the iteration after next.
In one embodiment, updating at least one matrix value in the normalized to-be-processed label matrix according to the first labels and the preset confidence threshold comprises:
for each first matrix value in the normalized to-be-processed label matrix that corresponds to a first label, updating the first matrix value to the value matching that first label; and
for the second matrix values in the normalized to-be-processed label matrix, that is, the values other than the first matrix values, updating at least one second matrix value according to the preset confidence threshold.
In one embodiment, the first labels include a positive label and/or a negative label;
updating at least one second matrix value according to the preset confidence threshold comprises:
if the second matrix value is greater than a first confidence threshold, updating the second matrix value to the positive label;
if the second matrix value is less than a second confidence threshold, updating the second matrix value to the negative label;
wherein the second confidence threshold is less than the first confidence threshold.
In one embodiment, updating each matrix value in the first label propagation matrix according to the second label matrix comprises:
in the second label matrix, if the labels corresponding to a second sample user and a first user to be predicted are the same, increasing the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted;
if the labels corresponding to the second sample user and the first user to be predicted are different, decreasing the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted.
In one embodiment, the preset convergence condition includes at least one of the following:
the label determination rate of the second labels of the users to be predicted reaches a first preset threshold;
the number of iterations reaches a second preset threshold;
the labels obtained in the current iteration are identical to the labels obtained in the previous iteration.
In one embodiment, the first label of the specified type is a credibility label.
In another aspect, one or more embodiments of this specification provide a label-propagation-based information prediction apparatus, comprising:
an obtaining module, configured to obtain first labels of a specified type for a plurality of sample users, and to obtain behavior information of each sample user and of a user to be predicted;
a first determining module, configured to determine an initial label of the user to be predicted according to the first labels and the behavior information;
a second determining module, configured to determine a dynamic label propagation algorithm corresponding to each iteration, and to iterate on the initial label according to the first labels using the dynamic label propagation algorithm;
a third determining module, configured to determine, when the iteration satisfies a preset convergence condition, the label obtained in the last iteration as a second label of the specified type for the user to be predicted.
In one embodiment, the first determining module comprises:
a first determining unit, configured to determine, according to the behavior information, a behavior similarity between each sample user and the user to be predicted;
a screening unit, configured to select, from the sample users, a first sample user having the highest behavior similarity to the user to be predicted;
a second determining unit, configured to determine the first label of the first sample user as the initial label of the user to be predicted.
In one embodiment, the second determining module comprises:
an iteration unit, configured to iterate the following steps until the label matrix obtained after an iteration satisfies the preset convergence condition:
determining a first label matrix and a first label propagation matrix corresponding to the current iteration, wherein the label matrix of the initial iteration is constructed from the first labels and the initial label;
computing the product of the first label propagation matrix and the first label matrix, and determining the product as a to-be-processed label matrix corresponding to the next iteration;
normalizing the to-be-processed label matrix, and updating at least one matrix value in the normalized to-be-processed label matrix according to the first labels and a preset confidence threshold, to obtain a second label matrix corresponding to the next iteration;
updating each matrix value in the first label propagation matrix according to the second label matrix, to obtain a second label propagation matrix corresponding to the next iteration;
determining, according to the second label propagation matrix and the second label matrix, a third label matrix corresponding to the iteration after next.
In one embodiment, the iteration unit is further configured to:
for each first matrix value in the normalized to-be-processed label matrix that corresponds to a first label, update the first matrix value to the value matching that first label; and
for the second matrix values in the normalized to-be-processed label matrix, that is, the values other than the first matrix values, update at least one second matrix value according to the preset confidence threshold.
In one embodiment, the first labels include a positive label and/or a negative label;
the iteration unit is further configured to:
if the second matrix value is greater than a first confidence threshold, update the second matrix value to the positive label;
if the second matrix value is less than a second confidence threshold, update the second matrix value to the negative label;
wherein the second confidence threshold is less than the first confidence threshold.
In one embodiment, the iteration unit is further configured to:
in the second label matrix, if the labels corresponding to a second sample user and a first user to be predicted are the same, increase the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted;
if the labels corresponding to the second sample user and the first user to be predicted are different, decrease the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted.
In one embodiment, the preset convergence condition includes at least one of the following:
the label determination rate of the second labels of the users to be predicted reaches a first preset threshold;
the number of iterations reaches a second preset threshold;
the labels obtained in the current iteration are identical to the labels obtained in the previous iteration.
In one embodiment, the first label of the specified type is a credibility label.
In yet another aspect, one or more embodiments of this specification provide a label-propagation-based information prediction device, comprising:
a processor; and
a memory arranged to store computer-executable instructions which, when executed, cause the processor to:
obtain first labels of a specified type for a plurality of sample users, and obtain behavior information of each sample user and of a user to be predicted;
determine an initial label of the user to be predicted according to the first labels and the behavior information;
determine a dynamic label propagation algorithm corresponding to each iteration, and iterate on the initial label according to the first labels using the dynamic label propagation algorithm;
when the iteration satisfies a preset convergence condition, determine the label obtained in the last iteration as a second label of the specified type for the user to be predicted.
In yet another aspect, embodiments of the present application provide a storage medium for storing computer-executable instructions which, when executed, implement the following process:
obtaining first labels of a specified type for a plurality of sample users, and obtaining behavior information of each sample user and of a user to be predicted;
determining an initial label of the user to be predicted according to the first labels and the behavior information;
determining a dynamic label propagation algorithm corresponding to each iteration, and iterating on the initial label according to the first labels using the dynamic label propagation algorithm;
when the iteration satisfies a preset convergence condition, determining the label obtained in the last iteration as a second label of the specified type for the user to be predicted.
With the technical solution of one or more embodiments of this specification, the initial label of the user to be predicted can be determined from the labels of the specified type of the sample users and the behavior information of each sample user and of the user to be predicted. As a result, during label propagation, the non-random initial label of the user to be predicted is closer to the labels of the sample users, which helps reduce the number of iterations in the information prediction process and improves its stability and efficiency. In addition, the technical solution uses a dynamic label propagation algorithm during information prediction, that is, each iteration uses a different label propagation algorithm, so the algorithm can be dynamically optimized during information prediction, which further improves the efficiency of the process. Moreover, no manual intervention is required during the entire information prediction process, which greatly reduces the risk introduced by manual operation.
Detailed description
One or more embodiments of this specification provide a label-propagation-based information prediction method and apparatus, so that information prediction by label propagation is both efficient and accurate.
To help those skilled in the art better understand the technical solutions in one or more embodiments of this specification, these technical solutions are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art on the basis of one or more embodiments of this specification without creative effort shall fall within the scope of protection of one or more embodiments of this specification.
Fig. 1 is a schematic flowchart of a label-propagation-based information prediction method according to an embodiment of this specification. As shown in Fig. 1, the method includes:
S102: obtain first labels of a specified type for a plurality of sample users, and obtain behavior information of each sample user and of a user to be predicted.
S104: determine an initial label of the user to be predicted according to the first labels and the behavior information.
S106: determine a dynamic label propagation algorithm corresponding to each iteration, and iterate on the initial label according to the first labels using the dynamic label propagation algorithm.
Here, a dynamic label propagation algorithm is a label propagation algorithm that changes over time. In this embodiment, the label propagation algorithm used differs from one iteration to the next; how the algorithm for each iteration is determined is described in detail in later embodiments.
S108: when the iteration satisfies a preset convergence condition, determine the label obtained in the last iteration as a second label of the specified type for the user to be predicted.
The technical solution of this embodiment can be applied to propagating labels of many types in order to predict the corresponding type of information about a user. For example, a user's credibility can be predicted by propagating credibility labels, a user's personality traits can be predicted by propagating personality labels, and a user's behavior information can be predicted by propagating behavior labels.
When the first label of the specified type is a credibility label (a credibility label identifies whether a user is credible), the above information prediction method can predict the credibility labels of users to be predicted that have no such label from the behavior information of a small number of sample users that do have one. It can therefore be used to build an efficient merchant credibility label prediction model that improves merchant risk identification, effectively combats improper conduct by merchants, and protects the property of customers.
In summary, with the technical solution of one or more embodiments of this specification, the initial label of the user to be predicted can be determined from the labels of the specified type of the sample users and the behavior information of each sample user and of the user to be predicted. As a result, during label propagation, the non-random initial label of the user to be predicted is closer to the labels of the sample users, which helps reduce the number of iterations in the information prediction process and improves its stability and efficiency. In addition, the technical solution uses a dynamic label propagation algorithm during information prediction, that is, each iteration uses a different label propagation algorithm, so the algorithm can be dynamically optimized during information prediction, which further improves the efficiency of the process. Moreover, no manual intervention is required during the entire information prediction process, which greatly reduces the risk introduced by manual operation.
The label-propagation-based information prediction method provided by the above embodiment is described in detail below.
First, the first labels of the specified type of a plurality of sample users are obtained, together with the behavior information of each sample user and of each user to be predicted.
Let $S_L = \{(x_i, y_i)\},\ i = 1, \dots, L$ denote the set of sample users with known labels, and let $S_U = \{(x_i, y_i)\},\ i = L+1, \dots, L+U$ denote the set of users to be predicted, whose labels are unknown. $Y_L = \{y_i\},\ i = 1, \dots, L$ is the set of known labels of the sample users, $Y_U = \{y_i\},\ i = L+1, \dots, L+U$ is the set of unknown labels of the users to be predicted, and $X = \{x_i\},\ i = 1, \dots, L+U$ is the set of behavior features of the sample users and the users to be predicted. Typically $L \ll U$. The goal of the information prediction method of this embodiment is therefore to predict $Y_U$ from $X$ and $Y_L$, thereby propagating labels of the specified type and obtaining the labels of the specified type for the unknown users.
The label of the specified type of a sample user can be obtained in several ways, and the candidate labels obtained in these ways can be considered together to determine the label of the sample user. For example, candidate labels of a sample user are obtained separately from blacklists and whitelists, strong rules, and expert experience, and the multiple candidate labels obtained in these ways are then analyzed together to determine the label of the sample user.
In one embodiment, if several different candidate labels are obtained for the same sample user, the label of that sample user can be determined by voting. That is, among all the candidate labels, the candidate label that occurs most often is taken as the label of the sample user.
Take a credibility label as the label of the specified type. Suppose that, for sample user A, the blacklist and whitelist show the label "trusted user", one rule shows the label "trusted user", and another rule shows the label "untrusted user". Since the label "trusted user" occurs more often than the label "untrusted user", the label of sample user A is determined to be "trusted user".
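The voting rule above can be sketched in a few lines. This is a minimal illustration only; the function name `vote_label` and the string label values are hypothetical, and the tie-breaking behavior (first label seen wins) is an assumption that the text does not specify.

```python
from collections import Counter

def vote_label(candidate_labels):
    """Return the candidate label that occurs most often.

    Ties are resolved in favor of the label seen first (an assumption;
    the description does not say how ties are handled).
    """
    if not candidate_labels:
        raise ValueError("at least one candidate label is required")
    return Counter(candidate_labels).most_common(1)[0][0]

# Sample user A: the whitelist and one rule say "trusted user",
# another rule says "untrusted user".
label_a = vote_label(["trusted user", "trusted user", "untrusted user"])
```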
When S102 is executed, any existing way of collecting behavior information can be used to obtain the behavior information of each sample user and of each user to be predicted. When the information prediction method of this embodiment is used to predict whether a merchant is credible, the behavior information may include the merchant's business registration and qualification information, legal representative information, operating history, transaction records, and the like.
In addition, the items of behavior information can be filtered according to the specific business scenario. Optionally, they can be filtered by the importance of each item. For example, if two items of behavior information are to be collected for each sample user and each user to be predicted, the two most important items can be selected from the collected business registration and qualification information, legal representative information, operating history, and transaction records. Alternatively, if the behavior information to be collected must exceed a certain level of importance, the items whose importance is higher than a preset threshold can be selected from the collected business registration and qualification information, legal representative information, operating history, and transaction records.
In this embodiment, the importance of each item of behavior information can be determined according to the needs of the specific business scenario.
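The two filtering options described above (keep the most important items, or keep every item above a threshold) might look as follows. The importance scores and key names here are invented purely for illustration; in practice they would come from the business scenario.

```python
def top_k_items(importance, k):
    """Keep the k items of behavior information with the highest importance."""
    return sorted(importance, key=importance.get, reverse=True)[:k]

def items_above(importance, threshold):
    """Keep every item whose importance is higher than the threshold."""
    return [name for name, score in importance.items() if score > threshold]

# Hypothetical importance scores for the four kinds of merchant information.
importance = {
    "registration_qualification": 0.9,
    "legal_representative": 0.4,
    "operating_history": 0.7,
    "transaction_records": 0.8,
}
two_most_important = top_k_items(importance, 2)
above_half = items_above(importance, 0.5)
```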
After the labels of the specified type of the sample users and the behavior information of each sample user and of each user to be predicted have been obtained, the initial label of each user to be predicted is determined according to the labels of the sample users and the behavior information of the sample users and the users to be predicted.
In one embodiment, the behavior similarity between each sample user and the user to be predicted is first determined from their behavior information. The first sample user with the highest behavior similarity to the user to be predicted is then selected from the sample users, and the label of that first sample user is taken as the initial label of the user to be predicted.
For example, the behavior similarity matrix between the sample users and the users to be predicted is $W \in \mathbb{R}^{(L+U)\times(L+U)}$, where $L+U$ denotes the $L$ sample users and the $U$ users to be predicted. Given a variance parameter $\sigma^2$, the behavior similarity between user $i$ and user $j$ can be computed by the following Gaussian function (1):
where $d_{ij}$ denotes the distance between user $i$ and user $j$. This distance can be computed with any existing distance measure, such as Euclidean distance, cosine similarity, the Pearson coefficient, or Manhattan distance. Since these distance measures belong to the prior art, they are not described further here.
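A minimal sketch of the similarity computation, assuming the common Gaussian kernel form $w_{ij} = \exp(-d_{ij}^2/\sigma^2)$ with Euclidean distance. The exact form of function (1) may differ (for example by a factor of 2 in the denominator), and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two behavior feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def gaussian_similarity(a, b, sigma2=1.0):
    """Assumed kernel: exp(-d^2 / sigma^2), where sigma2 is the variance parameter."""
    d = euclidean(a, b)
    return math.exp(-(d * d) / sigma2)

def similarity_matrix(features, sigma2=1.0):
    """Build the (L+U) x (L+U) behavior similarity matrix W."""
    n = len(features)
    return [[gaussian_similarity(features[i], features[j], sigma2)
             for j in range(n)] for i in range(n)]

# Two users at distance 5 with sigma^2 = 25 give a similarity of exp(-1).
W = similarity_matrix([[0.0, 0.0], [3.0, 4.0]], sigma2=25.0)
```

With this form, a user's similarity to itself is always 1, and similarity decays smoothly as the behavior features move apart.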
In this embodiment, the label of the sample user with the highest behavior similarity to the user to be predicted is used as the initial label of the user to be predicted. This brings the initial values of the iterative process closer to the sample users, which speeds up convergence and raises the probability of converging to the globally optimal solution (the labels that satisfy the preset convergence condition).
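The initial-label rule can be sketched as a nearest-sample lookup over the similarity matrix. The layout assumed here (sample users occupy the first L rows and columns of W) and the function name are illustrative assumptions.

```python
def initial_labels(W, sample_labels):
    """For each user to be predicted, copy the label of the most similar sample user.

    W is the (L+U) x (L+U) similarity matrix; the first L indices are
    assumed to be the sample users, in the same order as sample_labels.
    """
    num_samples = len(sample_labels)
    result = []
    for i in range(num_samples, len(W)):
        best = max(range(num_samples), key=lambda j: W[i][j])
        result.append(sample_labels[best])
    return result

# Merchants A and B are samples; C is to be predicted and is closer to A (0.8 vs 0.2).
W = [[1.0, 0.1, 0.8],
     [0.1, 1.0, 0.2],
     [0.8, 0.2, 1.0]]
init = initial_labels(W, ["positive", "negative"])
```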
After the initial label of the user to be predicted has been determined, the dynamic label propagation algorithm corresponding to each iteration can be determined, and the initial label of the user to be predicted is iterated on according to the labels of the sample users using the dynamic label propagation algorithm.
In one embodiment, the iteration is performed in matrix form. The iterative process may include the following steps A1 to A5, and it terminates when the label matrix obtained after an iteration satisfies the preset convergence condition.
Step A1: determine the first label matrix and the first label propagation matrix corresponding to the current iteration.
The label matrix of the initial iteration is constructed from the labels of the sample users and the initial labels of the users to be predicted. It is an $(L+U)\times 2$ matrix, where $L+U$ denotes the $L$ sample users and the $U$ users to be predicted.
The label propagation matrix $T$ of the initial iteration can be computed according to the following formula (2). $T$ is an $(L+U)\times(L+U)$ matrix, where $L+U$ denotes the $L$ sample users and the $U$ users to be predicted, and its entry $T_{ij}$ represents the probability that the credibility label of merchant $j$ propagates to merchant $i$.
where $W_{ij}$ denotes the behavior similarity between user $i$ and user $j$.
Alternatively, the behavior similarities between users can be used directly as the matrix values of the label propagation matrix $T$.
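A minimal sketch of building the initial propagation matrix, assuming formula (2) takes the standard label propagation form $T_{ij} = W_{ij} / \sum_k W_{kj}$ (each column of $W$ normalized to sum to 1, so that $T_{ij}$ is the probability of a label moving from $j$ to $i$). This form is an assumption, and the function name is hypothetical.

```python
def propagation_matrix(W):
    """Column-normalize the similarity matrix: T[i][j] = W[i][j] / sum_k W[k][j]."""
    n = len(W)
    col_sums = [sum(W[k][j] for k in range(n)) for j in range(n)]
    return [[W[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

W = [[1.0, 0.1, 0.8],
     [0.1, 1.0, 0.2],
     [0.8, 0.2, 1.0]]
T = propagation_matrix(W)
```

Under the alternative described above, `T` would simply be `W` itself, with no normalization.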
If the label matrix of the initial iteration already satisfies the preset convergence condition, the following steps need not be executed, and the labels of the users to be predicted can be determined directly from the label matrix of the initial iteration. If it does not satisfy the preset convergence condition, the iteration proceeds according to the following steps.
Step A2: compute the product of the first label propagation matrix and the first label matrix, and determine the product as the to-be-processed label matrix corresponding to the next iteration.
Step A3: normalize the to-be-processed label matrix, and update at least one matrix value in the normalized to-be-processed label matrix according to the labels of the sample users and the preset confidence threshold, to obtain the second label matrix corresponding to the next iteration.
Specifically, each first matrix value in the normalized to-be-processed label matrix, that is, each value corresponding to the label of a sample user, is updated to the value matching the label of that sample user. For the second matrix values, that is, the values in the normalized to-be-processed label matrix other than the first matrix values, at least one second matrix value is updated according to the preset confidence threshold.
For the first matrix values, the first matrix values in the normalized to-be-processed label matrix are compared with the labels of the sample users, and these first matrix values can be replaced with the label values of the sample users. A label value is a number or character used to identify a label. For example, the credibility label is represented by 0 and 1, where 0 indicates that the user's credibility label is "untrusted user" and 1 indicates that it is "trusted user". A first matrix value at a position in the to-be-processed label matrix that corresponds to a sample user's label value "0" can be replaced with "0", and a first matrix value at a position that corresponds to a sample user's label value "1" can be replaced with "1".
Assume that the labels of the sample users include a positive label and/or a negative label. For a second matrix value, if the second matrix value is greater than the first confidence threshold, it is updated to the positive label; if the second matrix value is less than the second confidence threshold, it is updated to the negative label. The second confidence threshold is less than the first confidence threshold.
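Step A3 can be sketched as follows for the two-column (positive, negative) label matrix. Several details are assumptions made for illustration: normalization is taken to be row normalization, the thresholds are compared against the positive-class column, the threshold values 0.8 and 0.2 are arbitrary, and sample labels are encoded as 1 (positive) and 0 (negative).

```python
def normalize_rows(Y):
    """Scale each row so that it sums to 1 (all-zero rows are left unchanged)."""
    out = []
    for row in Y:
        s = sum(row)
        out.append([v / s for v in row] if s else list(row))
    return out

def clamp_and_threshold(Y, sample_labels, t_pos=0.8, t_neg=0.2):
    """Normalize, reset sample rows to their known labels, threshold the rest."""
    out = normalize_rows(Y)
    # First matrix values: rows of sample users are reset to their known labels.
    for i, label in enumerate(sample_labels):
        out[i] = [1.0, 0.0] if label == 1 else [0.0, 1.0]
    # Second matrix values: rows of users to be predicted are thresholded.
    for i in range(len(sample_labels), len(out)):
        p = out[i][0]
        if p > t_pos:
            out[i] = [1.0, 0.0]   # confident enough: positive label
        elif p < t_neg:
            out[i] = [0.0, 1.0]   # confident enough: negative label
    return out

pending = [[1.8, 0.1], [0.3, 1.0], [1.8, 0.2], [1.0, 1.0]]
Y2 = clamp_and_threshold(pending, sample_labels=[1, 0])
```

Rows that fall between the two thresholds, such as the last row here, keep their normalized values and remain undetermined for the next iteration.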
If the second label matrix already satisfies the preset convergence condition, the following steps need not be executed, and the labels of the users to be predicted can be determined from the second label matrix. If the second label matrix does not satisfy the preset convergence condition, the iteration continues.
Step A4: update each matrix value in the first label propagation matrix according to the second label matrix, to obtain the second label propagation matrix corresponding to the next iteration.
In this step, if the labels corresponding to a second sample user and a first user to be predicted are the same in the second label matrix, the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted is increased; if their labels are different, that matrix value is decreased.
The matrix values in the label propagation matrix can be increased or decreased by a preset amount.
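A minimal sketch of step A4. The choices here are assumptions, not taken from the text: the current label of a row is read as the larger of its two columns, the adjusted entry is `T[u][s]` (propagation from sample user `s` to user to be predicted `u`), the step size 0.05 is arbitrary, and values are floored at 0.

```python
def update_propagation(T, Y, num_samples, step=0.05):
    """Strengthen T[u][s] when users u and s share a label, weaken it otherwise."""
    # 0 = positive column is larger, 1 = negative column is larger.
    hard = [0 if row[0] >= row[1] else 1 for row in Y]
    new_T = [list(row) for row in T]          # leave the input matrix untouched
    for u in range(num_samples, len(Y)):      # users to be predicted
        for s in range(num_samples):          # sample users
            if hard[u] == hard[s]:
                new_T[u][s] += step
            else:
                new_T[u][s] = max(0.0, new_T[u][s] - step)
    return new_T

T1 = [[1.0, 0.1, 0.8],
      [0.1, 1.0, 0.2],
      [0.8, 0.2, 1.0]]
Y2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
T2 = update_propagation(T1, Y2, num_samples=2)
```

Because the propagation matrix changes after every iteration, the propagation rule applied in the next iteration differs from the current one, which is what makes the algorithm dynamic.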
Step A5: determine, according to the second label propagation matrix and the second label matrix, the third label matrix corresponding to the iteration after next.
In this step, the third label matrix corresponding to the iteration after next can be determined in the manner of steps A2 and A3.
If the third label matrix already satisfies the preset convergence condition, the following steps need not be executed, and the labels of the users to be predicted can be determined from the third label matrix. If the third label matrix does not satisfy the preset convergence condition, the process returns to step A1 and the iteration continues.
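Steps A1 to A5 can be put together in one loop. The sketch below is an illustration under stated assumptions rather than the definitive procedure: row normalization, thresholds on the positive column, a fixed update step, and a stopping rule that uses only "labels unchanged" or "iteration limit reached". All function and parameter names are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalize_rows(Y):
    return [[v / sum(row) for v in row] if sum(row) else list(row) for row in Y]

def clamp_and_threshold(Y, sample_rows, t_pos, t_neg):
    out = [list(r) for r in Y]
    out[:len(sample_rows)] = [list(r) for r in sample_rows]  # restore known labels
    for i in range(len(sample_rows), len(out)):
        if out[i][0] > t_pos:
            out[i] = [1.0, 0.0]
        elif out[i][0] < t_neg:
            out[i] = [0.0, 1.0]
    return out

def update_propagation(T, Y, num_samples, step):
    hard = [0 if r[0] >= r[1] else 1 for r in Y]
    new_T = [list(r) for r in T]
    for u in range(num_samples, len(Y)):
        for s in range(num_samples):
            delta = step if hard[u] == hard[s] else -step
            new_T[u][s] = max(0.0, new_T[u][s] + delta)
    return new_T

def dynamic_label_propagation(T, Y, num_samples,
                              t_pos=0.8, t_neg=0.2, step=0.05, max_iter=50):
    """Return (final label matrix, number of iterations performed)."""
    sample_rows = [list(r) for r in Y[:num_samples]]
    for iteration in range(1, max_iter + 1):
        pending = matmul(T, Y)                                   # step A2
        Y_next = clamp_and_threshold(normalize_rows(pending),    # step A3
                                     sample_rows, t_pos, t_neg)
        if Y_next == Y:                                          # labels unchanged
            return Y_next, iteration
        T = update_propagation(T, Y_next, num_samples, step)     # step A4
        Y = Y_next                                               # step A5 repeats A2-A3
    return Y, max_iter

T1 = [[1.0, 0.1, 0.8], [0.1, 1.0, 0.2], [0.8, 0.2, 1.0]]
Y1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # A positive, B negative, C initialized from A
Y_final, iterations = dynamic_label_propagation(T1, Y1, num_samples=2)
```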
In one embodiment, the preset convergence condition includes at least one of the following:
A. The label determination rate of the users to be predicted reaches a first preset threshold.
The label determination rate of the users to be predicted is the proportion of determined matrix values among all matrix values in the label matrix that correspond to the labels of the users to be predicted. As described above, the matrix values corresponding to the labels of the users to be predicted are the second matrix values. After the second matrix values have been updated according to the preset confidence threshold, the second matrix values that were updated are determined matrix values, and those that were not updated remain undetermined.
B. The number of iterations reaches a second preset threshold.
C. The labels obtained in the current iteration are identical to the labels obtained in the previous iteration.
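The three conditions can be checked as sketched below. The determination rate is counted per row here, which gives the same ratio as counting per matrix value when a whole row is set to a label at once (an assumption of this sketch); the threshold values and function names are hypothetical.

```python
def determination_rate(Y, num_samples):
    """Share of users to be predicted whose row has been set to a definite label."""
    rows = Y[num_samples:]
    if not rows:
        return 1.0
    determined = sum(1 for r in rows if r in ([1.0, 0.0], [0.0, 1.0]))
    return determined / len(rows)

def has_converged(Y, Y_prev, num_samples, iteration, min_rate=0.95, max_iter=100):
    """True if any one of conditions A, B, or C holds."""
    return (determination_rate(Y, num_samples) >= min_rate   # condition A
            or iteration >= max_iter                         # condition B
            or Y == Y_prev)                                  # condition C

Y = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
rate = determination_rate(Y, num_samples=2)
```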
The information prediction method provided by the above embodiments is illustrated below with a specific embodiment.
In one embodiment, the label of the specified type is a credibility label, and a credibility label is either a positive label or a negative label: a positive label indicates that the merchant is a credible merchant, and a negative label indicates that the merchant is a non-credible merchant. The credibility label of merchant A is known to be a positive label, and the credibility label of merchant B is known to be a negative label. The credibility label of merchant C, the merchant to be predicted, is unknown. The credibility label of merchant C is now predicted from the credibility labels of merchants A and B and the behavior information of merchants A, B, and C.
Construct the first label matrix of first iteration first, as shown in the above, the first label matrix according to trade company A,The credible label of trade company B and user to be predicted are built-up.Also, label matrix is the matrix of 3*2, wherein label matrixEach row indicates the corresponding credible label information of each trade company;First row and secondary series respectively indicate two classifications of credible label, toolBody, first row indicates that credible label is positive the classification of label, and secondary series indicates that credible label is negative the classification of label.If a certainThe credible label of trade company is positive label, then the corresponding row of the trade company is 1 in the matrix value of first row, arranges (such as secondary series) at otherMatrix value be 0;Conversely, the label if the credible label of a certain trade company is negative, matrix of the corresponding row of the trade company in first rowValue is 0, is 1 in the matrix value of other column (such as secondary series).Therefore, the first label matrix is as follows:
In matrix Y1, merchant A corresponds to the first row. Because the matrix value in merchant A's row is 1 in the first column and 0 in the second column, the credibility label of merchant A is positive. The entries x and y indicate that the credibility label of merchant C, the merchant to be predicted, is unknown.
As for the values of x and y: as explained above, the initial label of a user to be predicted is the same as the label of the sample user with the highest behavior similarity to it. In this embodiment, the behavior similarity between merchant C and merchant A is assumed to be the highest, so the initial label of merchant C is the label of merchant A, that is, the positive label. The first label matrix is then as follows:
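With the initial label of merchant C set to the positive label, so that x = 1 and y = 0, the reconstructed matrix reads:

```latex
Y_1 =
\begin{pmatrix}
1 & 0 \\
0 & 1 \\
1 & 0
\end{pmatrix}
```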
In this embodiment, each matrix value of the first label propagation matrix of the first iteration is set to the behavior similarity between the corresponding merchants. Because the behavior similarity between a merchant and itself is 1, the matrix values on the diagonal of the first label propagation matrix are 1. Assume that the behavior similarity between merchant A and merchant B is 0.1, that between merchant A and merchant C is 0.8, and that between merchant B and merchant C is 0.2. The first label propagation matrix is then as follows:
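Reconstructed from the similarity values just given, and assuming the behavior similarity is symmetric and that rows and columns both follow the order merchant A, merchant B, merchant C:

```latex
T_1 =
\begin{pmatrix}
1   & 0.1 & 0.8 \\
0.1 & 1   & 0.2 \\
0.8 & 0.2 & 1
\end{pmatrix}
```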
Computing the product of the first label propagation matrix T1 and the first label matrix Y1 gives the to-be-processed label matrix corresponding to the second iteration:
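Working the product out from the values stated for T1 and Y1 (a reconstruction by direct arithmetic, for example the third row is 0.8·1 + 0.2·0 + 1·1 = 1.8 and 0.8·0 + 0.2·1 + 1·0 = 0.2):

```latex
T_1 Y_1 =
\begin{pmatrix}
1   & 0.1 & 0.8 \\
0.1 & 1   & 0.2 \\
0.8 & 0.2 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 \\
0 & 1 \\
1 & 0
\end{pmatrix}
=
\begin{pmatrix}
1.8 & 0.1 \\
0.3 & 1.0 \\
1.8 & 0.2
\end{pmatrix}
```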
Normalizing the above matrix gives the normalized to-be-processed label matrix:
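The normalization scheme is not restated in this passage. Assuming each row is divided by its row sum, which is the usual choice in label propagation and is consistent with the result stated for merchant C, the normalized matrix would be approximately:

```latex
\begin{pmatrix}
1.8/1.9 & 0.1/1.9 \\
0.3/1.3 & 1.0/1.3 \\
1.8/2.0 & 0.2/2.0
\end{pmatrix}
\approx
\begin{pmatrix}
0.947 & 0.053 \\
0.231 & 0.769 \\
0.900 & 0.100
\end{pmatrix}
```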
For the first matrix values in the normalized to-be-processed label matrix (that is, the matrix values at the positions corresponding to merchant A and merchant B), the first matrix values in the first row are updated to the values matching the label of merchant A, and the first matrix values in the second row are updated to the values matching the label of merchant B.
For the second matrix values in the normalized to-be-processed label matrix (that is, the matrix values other than the first matrix values), the second matrix values in the third row are updated according to the preset first confidence threshold and/or second confidence threshold. The first confidence threshold is a value in [0.5, 1], and the second confidence threshold and the first confidence threshold may sum to 1. Assume that in this embodiment the first confidence threshold is 0.8 and the second confidence threshold is 0.2. Matrix values in the third row that are greater than the first confidence threshold 0.8 are updated to the positive label, that is, 1, and matrix values in the third row that are less than the second confidence threshold 0.2 are updated to the negative label, that is, 0. This gives the second label matrix corresponding to the second iteration:
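Applying the two update rules above, with the rows of merchant A and merchant B reset to their known labels and the third row thresholded at 0.8 and 0.2 (under the row-normalization assumption the third row is 0.9 and 0.1 before thresholding), the reconstructed matrix is:

```latex
Y_2 =
\begin{pmatrix}
1 & 0 \\
0 & 1 \\
1 & 0
\end{pmatrix}
```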
From the second label matrix Y2 it can be seen that the label of merchant C is positive, which indicates that merchant C is a trustworthy merchant.
Assume that the preset convergence condition in this embodiment is that the label determinable rate of the merchants to be predicted reaches 80%. In the second label matrix Y2, the label of merchant C has been determined (as the positive label), so the preset convergence condition is met and no further iteration is needed. If the second label matrix Y2 did not meet the preset convergence condition, the above steps would be repeated until the label matrix obtained after an iteration met the preset convergence condition.
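The single iteration walked through in this embodiment can be sketched in plain Python as follows. This is a minimal illustration rather than the claimed implementation: the function names (`matmul`, `row_normalize`, `update_labels`) are hypothetical, and dividing each row by its sum is an assumed normalization scheme.

```python
from typing import Dict, List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def row_normalize(m: Matrix) -> Matrix:
    """Divide each row by its sum (assumed normalization scheme)."""
    result = []
    for row in m:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


def update_labels(norm: Matrix, known: Dict[int, List[float]],
                  high: float = 0.8, low: float = 0.2) -> Matrix:
    """Reset known rows to their labels; threshold the remaining rows."""
    updated = []
    for i, row in enumerate(norm):
        if i in known:
            updated.append(list(known[i]))  # sample user: keep the first label
        else:
            updated.append(
                [1.0 if v > high else 0.0 if v < low else v for v in row]
            )
    return updated


# Rows and columns follow the order merchant A, merchant B, merchant C.
T1 = [[1.0, 0.1, 0.8], [0.1, 1.0, 0.2], [0.8, 0.2, 1.0]]
Y1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # C starts with A's label
known = {0: [1.0, 0.0], 1: [0.0, 1.0]}     # labels of merchants A and B

Y2 = update_labels(row_normalize(matmul(T1, Y1)), known)
print(Y2)  # [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
```

The third row of `Y2` is the positive label, matching the conclusion that merchant C is a trustworthy merchant.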
In summary, specific embodiments of the subject matter have been described. Other embodiments fall within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve the desired result. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or sequential order, to achieve the desired result. In some embodiments, multitasking and parallel processing can be advantageous.
The above is the information prediction method based on label propagation provided by one or more embodiments of this specification. Based on the same idea, one or more embodiments of this specification also provide an information prediction device based on label propagation.
Fig. 2 is a schematic block diagram of an information prediction device based on label propagation according to an embodiment of this specification. As shown in Fig. 2, the information prediction device 200 based on label propagation includes:
An obtaining module 210, configured to obtain first labels of a specified type of multiple sample users, and to obtain behavior information of each sample user and of the user to be predicted;
A first determining module 220, configured to determine the initial label of the user to be predicted according to the first labels and the behavior information;
A second determining module 230, configured to determine the dynamic label propagation algorithm corresponding to each iteration, and to iterate on the initial label according to the first labels by using the dynamic label propagation algorithm;
A third determining module 240, configured to determine, when the iteration meets the preset convergence condition, that the label obtained in the last iteration is the second label of the specified type of the user to be predicted.
In one embodiment, the first determining module 220 includes:
A first determining unit, configured to determine the behavior similarity between each sample user and the user to be predicted according to the behavior information;
A screening unit, configured to select, from the sample users, the first sample user having the highest behavior similarity to the user to be predicted;
A second determining unit, configured to determine that the first label corresponding to the first sample user is the initial label of the user to be predicted.
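The three units above amount to a nearest-neighbor lookup on behavior similarity. A minimal sketch, assuming the similarities have already been computed and are passed in as a dictionary (the function name `pick_initial_label` and the data layout are hypothetical):

```python
from typing import Dict


def pick_initial_label(similarity: Dict[str, float],
                       sample_labels: Dict[str, str]) -> str:
    """Return the first label of the sample user most similar to the target."""
    # Screening step: the sample user with the highest behavior similarity.
    most_similar = max(similarity, key=lambda user: similarity[user])
    # That user's first label becomes the initial label.
    return sample_labels[most_similar]


# Similarities between the user to be predicted and sample users A and B.
similarity_to_target = {"A": 0.8, "B": 0.2}
sample_labels = {"A": "positive", "B": "negative"}

initial = pick_initial_label(similarity_to_target, sample_labels)
print(initial)  # positive
```

How the behavior similarity itself is computed is left open here, as it is not specified in this passage.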
In one embodiment, the second determining module 230 includes:
An iteration unit, configured to iterate the following steps until the label matrix obtained after an iteration meets the preset convergence condition:
Determine the first label matrix and the first label propagation matrix corresponding to the current iteration, where the label matrix corresponding to the first iteration is built from the first labels and the initial label;
Calculate the product of the first label propagation matrix and the first label matrix, and determine the product to be the to-be-processed label matrix corresponding to the next iteration;
Normalize the to-be-processed label matrix, and update at least one matrix value in the normalized to-be-processed label matrix according to the first labels and a preset confidence threshold, to obtain the second label matrix corresponding to the next iteration;
Update each matrix value in the first label propagation matrix according to the second label matrix, to obtain the second label propagation matrix corresponding to the next iteration;
Determine, according to the second label propagation matrix and the second label matrix, the third label matrix corresponding to the iteration after next.
In one embodiment, the iteration unit is further configured to:
For the first matrix values in the normalized to-be-processed label matrix that correspond to the first labels, update the first matrix values to the values matching the first labels; and
For the second matrix values in the normalized to-be-processed label matrix other than the first matrix values, update at least one second matrix value according to the preset confidence threshold.
In one embodiment, the first label includes a positive label and/or a negative label;
The iteration unit is further configured to:
If a second matrix value is greater than the first confidence threshold, update the second matrix value to the positive label;
If a second matrix value is less than the second confidence threshold, update the second matrix value to the negative label;
Where the second confidence threshold is less than the first confidence threshold.
In one embodiment, the iteration unit is further configured to:
In the second label matrix, if the labels corresponding to a second sample user and a first user to be predicted are the same, increase the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted;
If the labels corresponding to the second sample user and the first user to be predicted are different, decrease the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted.
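The propagation-matrix update described above can be sketched as follows. The text specifies only the direction of the change, so the fixed step size, the clamping to [0, 1], the symmetric update, and the skipping of rows whose label is not yet determined are all assumptions made for illustration; `update_propagation` and `is_one_hot` are hypothetical names.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def is_one_hot(row: Sequence[float]) -> bool:
    """True if the row holds exactly one 1 and zeros elsewhere."""
    return sorted(row) == [0.0] * (len(row) - 1) + [1.0]


def update_propagation(T: Matrix, Y: Matrix, sample_rows: Sequence[int],
                       target_rows: Sequence[int], step: float = 0.1) -> Matrix:
    """Raise T[s][t] when labels agree, lower it when they differ."""
    new_T = [row[:] for row in T]  # leave the input matrix untouched
    for s in sample_rows:
        for t in target_rows:
            if not is_one_hot(Y[t]):
                continue  # assumed: undetermined labels do not change T
            delta = step if Y[s] == Y[t] else -step
            value = min(1.0, max(0.0, T[s][t] + delta))  # assumed clamp
            new_T[s][t] = value
            new_T[t][s] = value  # assumed: keep the matrix symmetric
    return new_T


# Rows follow the order merchant A, merchant B, merchant C.
T1 = [[1.0, 0.1, 0.8], [0.1, 1.0, 0.2], [0.8, 0.2, 1.0]]
Y2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

T2 = update_propagation(T1, Y2, sample_rows=[0, 1], target_rows=[2])
```

With these inputs the A-C entry grows (same label) and the B-C entry shrinks (different label), so later iterations weight merchant A more heavily when propagating to merchant C.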
In one embodiment, the preset convergence condition includes at least one of the following:
The determinable rate of the second labels among the users to be predicted reaches a first preset threshold;
The number of iterations reaches a second preset threshold;
The label obtained in the current iteration is identical to the label obtained in the previous iteration.
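A check covering the three conditions listed above might look like the following sketch. It treats a label as determined when its row is one-hot and returns true as soon as any one condition holds; the function names and the default threshold values are hypothetical.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def is_determined(row: Sequence[float]) -> bool:
    """A label counts as determined when its row is one-hot."""
    return all(v in (0.0, 1.0) for v in row) and sum(row) == 1.0


def has_converged(current: Matrix, previous: Optional[Matrix],
                  target_rows: Sequence[int], iteration: int,
                  rate_threshold: float = 0.8,
                  max_iterations: int = 50) -> bool:
    """Return True if at least one preset convergence condition holds."""
    if not target_rows:
        return True  # nothing left to predict
    determined = sum(1 for t in target_rows if is_determined(current[t]))
    rate_reached = determined / len(target_rows) >= rate_threshold
    iterations_reached = iteration >= max_iterations
    unchanged = previous is not None and current == previous
    return rate_reached or iterations_reached or unchanged


Y2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
done = has_converged(Y2, previous=None, target_rows=[2], iteration=1)
print(done)  # True
```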
In one embodiment, the first label of the specified type is a credibility label.
With the device of one or more embodiments of this specification, the initial label of the user to be predicted can be determined according to the labels of the specified type of the sample users and the behavior information of each sample user and of the user to be predicted. During label propagation, the non-random initial label of the user to be predicted is therefore closer to the labels of the sample users, which helps reduce the number of iterations in the information prediction process and improves its stability and efficiency. In addition, the device uses a dynamic label propagation algorithm during information prediction, that is, each iteration uses a different label propagation algorithm, so the algorithm can be dynamically optimized during information prediction, which further helps improve the efficiency of the process. Moreover, the entire information prediction process requires no manual intervention, which greatly reduces the risk introduced by manual operation.
It should be understood that the above information prediction device based on label propagation can be used to implement the information prediction method based on label propagation described above. The details are similar to those in the description of the method above and, to avoid redundancy, are not repeated here.
Based on the same idea, one or more embodiments of this specification also provide information prediction equipment based on label propagation, as shown in Fig. 3. Such equipment may vary considerably with configuration or performance. It may include one or more processors 301 and a memory 302, and the memory 302 may store one or more application programs or data. The memory 302 may be transient storage or persistent storage. An application program stored in the memory 302 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions for the information prediction equipment based on label propagation. Further, the processor 301 may be configured to communicate with the memory 302 and to execute, on the information prediction equipment based on label propagation, the series of computer-executable instructions in the memory 302. The equipment may also include one or more power supplies 303, one or more wired or wireless network interfaces 304, one or more input/output interfaces 305, and one or more keyboards 306.
Specifically, in this embodiment, the information prediction equipment based on label propagation includes a memory and one or more programs. The one or more programs are stored in the memory and may include one or more modules, each of which may include a series of computer-executable instructions for the information prediction equipment based on label propagation. The one or more programs are configured to be executed by one or more processors and include computer-executable instructions for:
Obtaining first labels of a specified type of multiple sample users, and obtaining behavior information of each sample user and of the user to be predicted;
Determining the initial label of the user to be predicted according to the first labels and the behavior information;
Determining the dynamic label propagation algorithm corresponding to each iteration, and iterating on the initial label according to the first labels by using the dynamic label propagation algorithm;
When the iteration meets the preset convergence condition, determining that the label obtained in the last iteration is the second label of the specified type of the user to be predicted.
Optionally, when executed, the computer-executable instructions may further cause the processor to:
Determine the behavior similarity between each sample user and the user to be predicted according to the behavior information;
Select, from the sample users, the first sample user having the highest behavior similarity to the user to be predicted;
Determine that the first label corresponding to the first sample user is the initial label of the user to be predicted.
Optionally, when executed, the computer-executable instructions may further cause the processor to:
Iterate the following steps until the label matrix obtained after an iteration meets the preset convergence condition:
Determine the first label matrix and the first label propagation matrix corresponding to the current iteration, where the label matrix corresponding to the first iteration is built from the first labels and the initial label;
Calculate the product of the first label propagation matrix and the first label matrix, and determine the product to be the to-be-processed label matrix corresponding to the next iteration;
Normalize the to-be-processed label matrix, and update at least one matrix value in the normalized to-be-processed label matrix according to the first labels and a preset confidence threshold, to obtain the second label matrix corresponding to the next iteration;
Update each matrix value in the first label propagation matrix according to the second label matrix, to obtain the second label propagation matrix corresponding to the next iteration;
Determine, according to the second label propagation matrix and the second label matrix, the third label matrix corresponding to the iteration after next.
Optionally, when executed, the computer-executable instructions may further cause the processor to:
For the first matrix values in the normalized to-be-processed label matrix that correspond to the first labels, update the first matrix values to the values matching the first labels; and
For the second matrix values in the normalized to-be-processed label matrix other than the first matrix values, update at least one second matrix value according to the preset confidence threshold.
Optionally, the first label includes a positive label and/or a negative label;
When executed, the computer-executable instructions may further cause the processor to:
If a second matrix value is greater than the first confidence threshold, update the second matrix value to the positive label;
If a second matrix value is less than the second confidence threshold, update the second matrix value to the negative label;
Where the second confidence threshold is less than the first confidence threshold.
Optionally, when executed, the computer-executable instructions may further cause the processor to:
In the second label matrix, if the labels corresponding to a second sample user and a first user to be predicted are the same, increase the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted;
If the labels corresponding to the second sample user and the first user to be predicted are different, decrease the matrix value in the first label propagation matrix that corresponds to the second sample user and the first user to be predicted.
Optionally, the preset convergence condition includes at least one of the following:
The determinable rate of the second labels among the users to be predicted reaches a first preset threshold;
The number of iterations reaches a second preset threshold;
The label obtained in the current iteration is identical to the label obtained in the previous iteration.
Optionally, the first label of the specified type is a credibility label.
One or more embodiments of this specification also provide a computer-readable storage medium storing one or more programs. The one or more programs include instructions that, when executed by an electronic device including multiple application programs, cause the electronic device to execute the above information prediction method based on label propagation, and specifically to:
Obtain first labels of a specified type of multiple sample users, and obtain behavior information of each sample user and of the user to be predicted;
Determine the initial label of the user to be predicted according to the first labels and the behavior information;
Determine the dynamic label propagation algorithm corresponding to each iteration, and iterate on the initial label according to the first labels by using the dynamic label propagation algorithm;
When the iteration meets the preset convergence condition, determine that the label obtained in the last iteration is the second label of the specified type of the user to be predicted.
The systems, devices, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above device is described in terms of various units divided by function. Of course, when one or more embodiments of this specification are implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
One or more embodiments of this specification are described with reference to flowcharts and/or block diagrams of the method, equipment (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity, or device that includes the element.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner. For the same or similar parts of the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiment is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, refer to the description of the method embodiment.
The above are merely one or more embodiments of this specification and are not intended to limit this specification. For those skilled in the art, various modifications and variations of one or more embodiments of this specification are possible. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of one or more embodiments of this specification shall fall within the scope of the claims of one or more embodiments of this specification.