CN110188198A

Info

Publication number: CN110188198A
Application number: CN201910415531.XA
Authority: CN
Inventors: 窦志成; 姜涛; 韩维思; 黄真
Original assignee: Beijing Wisdom Data Technology Co Ltd
Current assignee: Beijing Wisdom Data Technology Co Ltd
Priority date: 2019-05-13
Filing date: 2019-05-13
Publication date: 2019-08-30
Anticipated expiration: 2039-05-13
Also published as: CN110188198B

Abstract

This application discloses a knowledge-graph-based anti-fraud method and device. The method comprises: extracting entities, entity attribute data, and relation data from data sources; screening and processing the entity attribute data, and constructing a knowledge graph from the processed entity attribute data and the relation data, wherein the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node with a known label and a second-class node being a node whose label is to be predicted; and predicting the labels of the second-class nodes based on the knowledge graph.

Description

A knowledge-graph-based anti-fraud method and device

Technical field

This application relates to anti-fraud technology, and in particular to a knowledge-graph-based anti-fraud method and device.

Background art

Lending institutions such as traditional banks spend considerable manpower and time assessing and investigating loan-applicant enterprises, which makes corporate financing cycles long and financing costs high. In addition, the data that many small enterprises voluntarily declare when applying for loans contain a large amount of deceptive information, so institutions cannot correctly assess lending risk.

Summary of the invention

To solve the above technical problems, the embodiments of the present application provide a knowledge-graph-based anti-fraud method and device.

The knowledge-graph-based anti-fraud method provided by the embodiments of the present application comprises:

extracting entities, entity attribute data, and relation data from data sources;

screening and processing the entity attribute data, and constructing a knowledge graph from the processed entity attribute data and the relation data, wherein the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node with a known label and a second-class node being a node whose label is to be predicted; and

predicting the labels of the second-class nodes based on the knowledge graph.

In one embodiment, the entity is an enterprise; correspondingly,

the entity attribute data include enterprise information and individual customer information;

the relation data include at least one of the following: correspondences between enterprises and individuals, correspondences between individuals, correspondences between enterprises and related attributes, correspondences between individuals and related attributes, and correspondences between enterprises.

In one embodiment, before the knowledge graph is constructed, the method further comprises:

reducing the relation data so that every relation corresponds to enterprises.

In one embodiment, the individual customer information includes information on the enterprise's actual controller and information on multiple enterprise shareholders;

the screening and processing of the entity attribute data comprises:

aggregating the information of the multiple enterprise shareholders to obtain shareholder aggregate features;

associating the actual-controller features and the shareholder aggregate features with the enterprise to obtain enterprise sample data; and

performing at least one of the following on the enterprise sample data: outlier handling, missing-value handling, analysis of correlation between variables, and categorical-variable encoding.

In one embodiment, constructing the knowledge graph from the processed entity attribute data and the relation data comprises:

using enterprises as the nodes of the knowledge graph;

using the processed enterprise information, the enterprise's actual-controller information, and the aggregate of the enterprise shareholder information together as node attributes;

using the reduced relations between enterprises as the relations of the knowledge graph; and

deleting isolated nodes present in the knowledge graph.

In one embodiment, predicting the labels of the second-class nodes based on the knowledge graph comprises:

S1: train on the attribute features of enterprises whose true labels are known to obtain a fraud prediction model, local_classifier;

S2: extract from the knowledge graph the attribute feature matrix of the enterprises whose true labels are known; according to the node associations in the knowledge graph, compute for each enterprise node in the training data set the proportion of positive samples among its first-degree neighbors; append this proportion to the enterprise attribute features and train again to obtain a fraud prediction model that incorporates neighbor label information, relation_classifier;

S3: input the attribute features of the enterprises with unknown labels into the model trained in S1 to obtain a preliminary fraud risk probability, pos_probability, and store this probability as an attribute of the corresponding to-be-predicted enterprise node in the knowledge graph;

S4: set a predefined maximum number of iteration rounds N, initialize the iteration counter i to 1, and let M be the number of enterprise nodes to be predicted, where N, i, and M are positive integers;

S5: for each enterprise node to be predicted, compute the non-fraud probability neg_probability = 1 - pos_probability, take the difference between pos_probability and neg_probability to obtain the confidence value confidence, and sort the nodes by the absolute value of confidence;

S6: select the first i*M/N enterprises for pre-classification: if confidence is greater than 0, set the estimated label of the sample to positive; if confidence is less than or equal to 0, set the estimated label to negative; and write the label back to the pred attribute in the knowledge graph;

S7: for each enterprise node to be predicted, compute the proportion of positive samples among its first-degree neighbors, where a neighbor is included in the calculation if it is in the training sample or is a test node whose label was determined in the previous round, and append the result to the attribute data of the enterprise;

S8: classify with the relation_classifier from S2 to obtain the fraud risk probability pos_probability of each node for the current round, and write it back to update the attribute value in the knowledge graph;

S9: increment the iteration counter i by 1 and repeat S5, S6, S7, and S8; the iteration ends when i > N or when the predictions of the current round are identical to those of the previous round.
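Steps S3 to S9 can be sketched as the loop below. This is a minimal illustration, not the applicant's implementation: the function names, the two probability callables standing in for local_classifier and relation_classifier, the descending sort order, the floor division for i*M/N, the use of the labels most recently written back in S6 for the neighbor ratio, and the ratio of 0.0 for nodes without labeled neighbors are all assumptions.

```python
def positive_neighbor_ratio(node, neighbors, labels):
    """Share of first-degree neighbors whose known label is positive (1)."""
    known = [labels[n] for n in neighbors.get(node, ()) if n in labels]
    # Assumption: a node with no labeled neighbor gets a ratio of 0.0.
    return sum(known) / len(known) if known else 0.0


def iterative_predict(unlabeled, neighbors, train_labels, features,
                      local_prob, relation_prob, max_rounds):
    """Hypothetical sketch of S3-S9; returns (pred labels, pos_probability)."""
    # S3: preliminary fraud probability from the attribute-only model.
    pos = {v: local_prob(features[v]) for v in unlabeled}
    m = len(unlabeled)
    pred = {}
    for i in range(1, max_rounds + 1):  # S4 and S9: rounds i = 1..N
        # S5: confidence = pos_probability - neg_probability.
        conf = {v: pos[v] - (1.0 - pos[v]) for v in unlabeled}
        ranked = sorted(unlabeled, key=lambda v: abs(conf[v]), reverse=True)
        # S6: pre-classify the first i*M/N nodes (floor division assumed).
        top = ranked[: (i * m) // max_rounds]
        new_pred = {v: 1 if conf[v] > 0 else 0 for v in top}
        # S7: neighbors count if they are training nodes or pre-classified.
        known = {**train_labels, **new_pred}
        # S8: re-score every node with the neighbor-aware model.
        pos = {v: relation_prob(features[v],
                                positive_neighbor_ratio(v, neighbors, known))
               for v in unlabeled}
        # S9: stop early when the predictions did not change.
        if new_pred == pred:
            break
        pred = new_pred
    return pred, pos


# Toy data: "a" is a known fraud case, "b" a known normal enterprise.
train_labels = {"a": 1, "b": 0}
neighbors = {"c": ["a"], "d": ["b"], "a": ["c"], "b": ["d"]}
features = {"c": [0.9], "d": [0.2]}
labels, probs = iterative_predict(
    ["c", "d"], neighbors, train_labels, features,
    local_prob=lambda x: x[0],
    relation_prob=lambda x, ratio: 0.5 * x[0] + 0.5 * ratio,
    max_rounds=2,
)
```

In this toy run the node next to a known fraud case is labeled positive in the first round, and the remaining node is labeled in the second round once the selection window i*M/N covers all M nodes.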

The knowledge-graph-based anti-fraud device provided by the embodiments of the present application comprises:

an extraction unit, configured to extract entities, entity attribute data, and relation data from data sources;

a processing unit, configured to screen and process the entity attribute data;

a graph construction unit, configured to construct a knowledge graph from the processed entity attribute data and the relation data, wherein the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node with a known label and a second-class node being a node whose label is to be predicted; and

a prediction unit, configured to predict the labels of the second-class nodes based on the knowledge graph.

In one embodiment, the entity is an enterprise; correspondingly,

In one embodiment, the device further comprises:

a reduction unit, configured to reduce the relation data so that every relation corresponds to enterprises.

The processing unit is configured to: aggregate the information of the multiple enterprise shareholders to obtain shareholder aggregate features; associate the actual-controller features and the shareholder aggregate features with the enterprise to obtain enterprise sample data; and perform at least one of the following on the enterprise sample data: outlier handling, missing-value handling, analysis of correlation between variables, and categorical-variable encoding.

In one embodiment, the graph construction unit is configured to:

use enterprises as the nodes of the knowledge graph;

delete isolated nodes present in the knowledge graph.

In one embodiment, the prediction unit is configured to execute the following steps:

The technical solution of the embodiments of the present application can use an enterprise relation graph to identify and filter out high-risk enterprises. It integrates enterprise industrial-and-commercial data and data on the persons responsible for the enterprise, takes into account the enterprises associated with the target enterprise, and finally depicts the enterprises and the association relations between them. The solution helps identify fraud cases such as organized group fraud, gang-related organized crime, and loan fraud; it can fully assess the risk status of loan-applicant enterprises, prevent hidden fraud in advance, and block the loan path. Besides identifying enterprise fraud risk, the various relation graphs constructed also support visual analysis and mining of shareholding structure, litigation cases, senior-executive relationships, kinship, and so on.

Brief description of the drawings

Fig. 1 is a schematic flowchart of the knowledge-graph-based anti-fraud method provided by the embodiments of the present application;

Fig. 2 is a logic diagram of the risk probability prediction algorithm provided by the embodiments of the present application;

Fig. 3 is a schematic structural diagram of the knowledge-graph-based anti-fraud device provided by the embodiments of the present application.

Detailed description of the embodiments

To facilitate understanding of the technical solution of the embodiments of the present application, the related art is described below.

Current credit anti-fraud implementations can be summarized as four main technical approaches: black/white lists, rule engines, supervised learning, and unsupervised learning.

Black/white lists are the simplest and most primitive anti-fraud measure: new applicants are matched against historical blacklist data to filter out fraudulent users.

Rule engines originate from rule-based expert systems, which simulate human behavior to automate decision-making. A rule engine is a mechanism built on a thorough understanding of the characteristics and patterns of fraud, with rules designed for and triggered by single or combined fraudulent behaviors.

Supervised learning is currently the most widely used machine learning method in anti-fraud detection. It requires collecting known fraud data and normal data as a training set; the trained model, through an abstract understanding of user features, analyzes hidden relations between features and thus covers complex fraud that rule engines cannot.

Unsupervised learning is an anti-fraud strategy that has emerged in recent years. This class of detection algorithm trains without relying on any labels; through association analysis and similarity analysis, it finds shared anomalies among fraudulent user behaviors, forms clusters, and uncovers unknown fraud within one or more clusters.

Each of the four approaches has problems. 1. Black/white lists are easy to use, but they take a long time to accumulate and are expensive to purchase. Their effectiveness in identifying fraudsters is limited by the size and source of the blacklist, and they naturally lag in time, so it is difficult to stop fraud cases in advance. 2. Rule engines based on expert experience are simple to configure, but rule formulation and updating depend on business experience and carry some risk of misjudgment. A rule engine can identify new fraudsters but cannot detect new fraud patterns. Because rules remain effective only for a limited time, maintaining a rule engine consumes substantial operational resources, time, and expense. 3. The widely used supervised learning avoids interference from human experience, but collecting enough training data with accurate labels imposes a limitation. Most machine learning models, especially the logistic regression models commonly used in the financial industry, need a long training time and therefore struggle to cope with ever-changing fraud. In addition, traditional supervised learning is mostly suited to independently distributed data, that is, data in which the features of different samples are neither interrelated nor interdependent. In anti-fraud scenarios, however, hidden associations between enterprises often carry unknown latent information, and using information about associated enterprises to predict an enterprise itself is a challenge for traditional supervised learning. 4. Unsupervised learning does not require a large manual labeling effort, but the clustering results still need to be screened by business experts using domain knowledge, and there is no clear standard or evaluation for cluster quality.

To solve the above problems, the following technical solution is proposed. It aims to combine publicly disclosed enterprise fraud records, integrate multiple authoritative data sources, actively construct a relation graph between enterprises, comprehensively assess the operating condition of unknown target enterprises, quantify enterprise fraud risk, and help lending institutions quickly formulate risk-control strategies.

To understand the features and technical content of the embodiments more fully, their implementation is described in detail below with reference to the accompanying drawings, which are for reference and illustration only and are not intended to limit the embodiments. The concepts involved in the embodiments are explained first:

Entity: a thing that is distinguishable and exists independently. The anti-fraud graph constructed in this application contains only one kind of entity, namely the enterprise.

Relation: a connection between entities, for example "same actual controller" or "same telephone number".

Attribute: a description of an entity or a relation. Entities generally have attributes, such as an enterprise's industrial-and-commercial data. Relations can also have attributes, such as the weight of a relation.

First-degree association (first-degree neighbor): a node directly connected to the target node.

Enterprise shareholder: the actual controller, legal representative, senior executives, and shareholders.

AUM (assets under management): a metric banks use to evaluate a customer, measuring the customer's contribution to the bank.

LightGBM (Light Gradient Boosting Machine): an open-source framework implementing the GBDT (Gradient Boosting Decision Tree) algorithm that supports efficient parallel training.

Fig. 1 is a schematic flowchart of the knowledge-graph-based anti-fraud method provided by the embodiments of the present application. As shown in Fig. 1, the method comprises the following steps:

Step 101: extract entities, entity attribute data, and relation data from data sources.

All data sources used in this application come from enterprise data and individual customer data provided by the institution, and from external third-party data. Data extraction can be divided into the extraction of entities and attributes and the extraction of relations. In an optional embodiment, the extraction scope is limited to enterprises whose loan application time falls between time t1 and time t2 and that have repayment performance; for example, enterprises that applied for a loan between January 2018 and December 2018 and have repayment performance.
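The extraction scope described above amounts to a date-window filter combined with a repayment-performance flag. A minimal sketch follows; the record layout, the field names, and the inclusive window bounds are assumptions for illustration only.

```python
from datetime import date


def in_extraction_scope(applied, has_repayment_record, t1, t2):
    """True if the loan application falls in [t1, t2] (bounds assumed
    inclusive) and the enterprise has repayment performance."""
    return t1 <= applied <= t2 and has_repayment_record


# Hypothetical application records.
applications = [
    {"enterprise_id": "E1", "applied": date(2018, 3, 5), "repaid": True},
    {"enterprise_id": "E2", "applied": date(2018, 7, 19), "repaid": False},
    {"enterprise_id": "E3", "applied": date(2019, 1, 2), "repaid": True},
]

t1, t2 = date(2018, 1, 1), date(2018, 12, 31)
selected = [a["enterprise_id"] for a in applications
            if in_extraction_scope(a["applied"], a["repaid"], t1, t2)]
```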

A) Extraction of entities and attributes

This application models at enterprise granularity: each entity is an enterprise. Entity attributes consist of the attribute information of the enterprise itself and the attribute information of the individual customers related to the enterprise. The attribute information of the enterprise itself (referred to as enterprise data or enterprise information) includes but is not limited to basic information such as the enterprise's technical number, industrial-and-commercial data, telephone number, registered address, industry category, and date of incorporation, as well as enterprise deposit data, transfer data, and loan data. The attribute information of the related individual customers (referred to as individual customer data or individual customer information) includes but is not limited to basic information such as the individual's technical number, gender, age, education, residential area, occupation, position, professional title, marital status, and children, as well as individual deposit data and loan data.

B) Extraction of relations

The data tables involved in relation extraction fall into five classes: 1. enterprise-individual correspondences (actual-control relation table, senior-executive relation table, legal-representative relation table, shareholding relation table); 2. individual-individual correspondences (lineal-relative relation table, spouse relation table); 3. enterprise-attribute correspondences (enterprise address relation table, enterprise telephone number relation table); 4. individual-attribute correspondences (personal address relation table, personal telephone number relation table, personal device usage relation table); 5. enterprise-enterprise correspondences (enterprise guarantee relation table).

In the embodiments of the present application, the relation data need to be reduced so that every relation corresponds to enterprises. Specifically, the original enterprise relation information comes from many sources, is heterogeneous, and is fragmented. To keep the overall enterprise relation network homogeneous, that is, to keep the knowledge graph's entities uniform, this application reduces relations in the manner shown in Table 1 below, ensuring that every relation corresponds to the enterprises themselves.

Table 1
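The contents of Table 1 are not reproduced in this text, so the exact reduction rules are unknown. As one plausible reading, consistent with the example relations "same actual controller" and "same telephone number", the sketch below links any two enterprises that share the same person or attribute value. The input layout and the `same_` naming are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def reduce_relations(links):
    """Turn (enterprise, relation_type, shared_value) links into direct
    enterprise-to-enterprise edges (hypothetical reduction rule)."""
    groups = defaultdict(set)
    for enterprise, rel_type, value in links:
        groups[(rel_type, value)].add(enterprise)
    edges = set()
    for (rel_type, _value), members in groups.items():
        # Every pair of enterprises sharing the value gets one edge.
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b, "same_" + rel_type))
    return sorted(edges)


links = [
    ("E1", "actual_controller", "P1"),
    ("E2", "actual_controller", "P1"),
    ("E2", "telephone", "555-0100"),
    ("E3", "telephone", "555-0100"),
    ("E4", "telephone", "555-0199"),
]
edges = reduce_relations(links)
```

The person P1 and the telephone numbers disappear from the result; only enterprise-to-enterprise edges remain, which keeps the graph homogeneous.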

Step 102: screen and process the entity attribute data.

A) Enterprise data

The attribute data of an enterprise come from the enterprise information extracted from the data sources above. They specifically include basic information such as the enterprise's technical number, industrial-and-commercial data, registered address, industry category, date of incorporation, and registration date, as well as enterprise deposit data, transfer data, and loan data. The industrial-and-commercial data comprise five items: registered capital, total assets at annual inspection, total profit, sales or operating income, and total net assets. Enterprise deposit data include the deposit balance at the data cutoff time and the monthly, quarterly, and yearly deposit accumulations. Enterprise transfer data include the total number of transfers (in and out) and the total transfer amount within one year. Enterprise loan data include, within one year, the number of loan applications, the number of rejected applications, and overdue status (overdue principal, interest, and days).

B) Data on persons responsible for the enterprise

Enterprise data alone are insufficient for predicting fraud cases. Therefore, in addition to enterprise attribute data, this application matches each enterprise with information on its actual controller and other shareholders to build multi-dimensional, enterprise-based feature data and strengthen the representational power of the overall data. Each enterprise has a unique actual controller and multiple other shareholders, and the actual controller is more closely associated with the enterprise than the other shareholders are. Therefore, the actual controller's information is joined to the enterprise information on its own, and the information of the enterprise's other persons, who are not the actual controller (that is, the shareholders), is aggregated to further extend the enterprise features. Both the actual-controller information and the other shareholders' information come from the individual customer data extracted above, specifically including the individual's technical number, gender, age, education, residential area, occupation, position, professional title, marital status, and children, as well as individual deposit data and loan data. Individual deposit data include the deposit balance at the data cutoff time, the monthly, quarterly, and yearly deposit accumulations, and the point-in-time, monthly average, and yearly average AUM values. Loan data include, within one year, the number of loan applications and overdue status (number of consecutive periods in arrears, overdue principal, interest, and maximum days in arrears).

When the other shareholders of an enterprise are aggregated, the aggregate functions chosen for different variables include:

Numeric variables: maximum, sum, median, and mean, each taken separately

Categorical variables: mode
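The aggregation rules above can be sketched with the Python standard library. The field names, the output naming scheme, and the choice to skip missing values are assumptions for illustration.

```python
from statistics import mean, median, mode


def aggregate_shareholders(shareholders, numeric_fields, categorical_fields):
    """Collapse several shareholder records into one feature dict."""
    out = {}
    for field in numeric_fields:
        # Assumption: missing values are skipped before aggregating.
        values = [s[field] for s in shareholders if s.get(field) is not None]
        if values:
            out[field + "_max"] = max(values)
            out[field + "_sum"] = sum(values)
            out[field + "_median"] = median(values)
            out[field + "_mean"] = mean(values)
    for field in categorical_fields:
        values = [s[field] for s in shareholders if s.get(field) is not None]
        if values:
            out[field + "_mode"] = mode(values)
    return out


# Hypothetical shareholder records for one enterprise.
shareholders = [
    {"age": 40, "education": "bachelor"},
    {"age": 50, "education": "master"},
    {"age": 60, "education": "bachelor"},
]
agg = aggregate_shareholders(shareholders, ["age"], ["education"])
```

Each enterprise then carries a fixed-width set of shareholder features regardless of how many shareholders it has.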

Finally, the processed actual-controller features and shareholder aggregate features are associated with the enterprise so that each sample (enterprise) corresponds to one record. The enterprise sample data then undergo at least one of the following: outlier handling, missing-value handling, analysis of correlation between variables, and categorical-variable encoding. For example, features whose missing rate is 80% or higher, or whose Pearson coefficient is higher than 0.98, are deleted, and the remaining features together serve as the enterprise attribute features for model training.
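The example thresholds above can be applied as in the following sketch. The text does not say which feature of a highly correlated pair is removed or whether the coefficient is taken in absolute value; here the later feature of a pair is dropped and the signed coefficient is compared, both of which are assumptions, as are the feature names.

```python
from math import sqrt


def missing_rate(column):
    return sum(v is None for v in column) / len(column)


def pearson(xs, ys):
    """Pearson coefficient over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(columns, max_missing=0.8, max_corr=0.98):
    """Keep features below both thresholds, scanning in insertion order."""
    kept = []
    for name, column in columns.items():
        if missing_rate(column) >= max_missing:
            continue  # missing rate of 80% or more
        if any(pearson(column, columns[k]) > max_corr for k in kept):
            continue  # nearly duplicates a feature that was already kept
        kept.append(name)
    return kept


columns = {
    "registered_capital": [100.0, 200.0, 300.0, 400.0, 500.0],
    "total_assets": [210.0, 405.0, 590.0, 820.0, 1000.0],
    "mostly_missing": [None, None, None, None, 1.0],
    "transfer_count": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = select_features(columns)
```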

Step 103: construct a knowledge graph from the processed entity attribute data and the relation data, wherein the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node with a known label and a second-class node being a node whose label is to be predicted.

The knowledge graph in the embodiments of the present application is also called the anti-fraud enterprise knowledge graph. It is constructed as follows:

A) Use enterprises as the nodes of the knowledge graph.

Specifically, the enterprises that submit credit applications serve as the node entities of the graph; the persons or objects used to establish relations are reduced away during relation construction.

B) Use the processed enterprise information, the enterprise's actual-controller information, and the aggregate of the enterprise shareholder information together as node attributes.

Specifically, the basic enterprise information, the actual controller's basic information, and the aggregate of the shareholders' basic information processed in step 102 together serve as the entity's attributes.

C) Use the reduced relations between enterprises as the relations of the knowledge graph.

Specifically, the various reduced relations between enterprises from step 101 serve as the relations of the graph.

D) Delete isolated nodes present in the knowledge graph.
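Construction steps A) to D) can be sketched with plain dictionaries. The attribute dictionaries, the edge tuples, and the function name are hypothetical; a graph database or graph library could equally be used.

```python
def build_graph(enterprise_attrs, edges):
    """Return (nodes, adjacency) with isolated enterprises removed.

    enterprise_attrs: {enterprise_id: attribute dict}      (steps A and B)
    edges: iterable of (enterprise_a, enterprise_b, type)  (step C)
    """
    adjacency = {e: set() for e in enterprise_attrs}
    for a, b, _rel_type in edges:
        if a in adjacency and b in adjacency and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    # Step D: keep only nodes that have at least one relation.
    connected = {e for e, nbrs in adjacency.items() if nbrs}
    nodes = {e: enterprise_attrs[e] for e in connected}
    return nodes, {e: adjacency[e] for e in connected}


attrs = {
    "E1": {"registered_capital": 100.0},
    "E2": {"registered_capital": 250.0},
    "E3": {"registered_capital": 80.0},
}
graph_edges = [("E1", "E2", "same_actual_controller")]
nodes, adjacency = build_graph(attrs, graph_edges)
```

Here E3 has no relation to any other enterprise, so it is dropped; its label could not benefit from neighbor information in any case.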

In the embodiments of the present application, because historical data lack a precise definition of enterprise fraud, enterprise fraud labels are established from the serious violation records of enterprises and enterprise shareholders disclosed over a period of time by the institution internally or by relevant authorities, and the label is used as the target variable. Relevant serious violation data for enterprises and individuals include but are not limited to: 1. fraud lists in the institution's internal anti-fraud system; 2. administrative violation records and suspect blacklists of enterprises and individuals; 3. a number of cross-platform borrowings greater than N, where N is, for example, 4.
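A label rule built from the three example criteria might look like the sketch below. Treating any single criterion as sufficient for a positive label is an assumption, as are the parameter names.

```python
def fraud_label(in_internal_fraud_list, has_violation_or_blacklist,
                cross_platform_loans, n=4):
    """Return 1 (fraud) if any example criterion is met, else 0.

    Assumption: one criterion alone is enough for a positive label.
    """
    return int(in_internal_fraud_list
               or has_violation_or_blacklist
               or cross_platform_loans > n)


example_labels = [
    fraud_label(False, False, 5),   # more than N = 4 cross-platform loans
    fraud_label(False, False, 4),   # exactly N, so not flagged
    fraud_label(True, False, 0),    # on the internal fraud list
]
```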

Step 104: predict the labels of the second-class nodes based on the knowledge graph.

In the embodiments of the present application, predicting the labels of the second-class nodes based on the knowledge graph can be defined as the following problem: the enterprise relation graph constructed in step 103 is denoted G; the set of all nodes on G is denoted V; X is the enterprise attribute vector from step 1; Vk is the set of nodes in G with known labels; and Vu is the set of nodes in G to be predicted. Given the graph structure G, the attributes X carried by all nodes (enterprises), and the known-label nodes Vk, the label types of Vu are predicted from this information. Corresponding pseudocode is as follows:

Referring to Fig. 2, the above pseudocode implements the following steps:

S1: train on the attribute features of enterprises whose true labels are known, using the LightGBM framework open-sourced by Microsoft, to obtain a fraud prediction model, local_classifier, that depends only on the enterprises' own attributes;

S2: extract from the knowledge graph the attribute feature matrix of the enterprises whose true labels are known; according to the node associations in the knowledge graph, compute for each enterprise node in the training data set the proportion of positive samples among its first-degree neighbors; append this proportion to the enterprise attribute features and train again with the LightGBM framework to obtain a fraud prediction model that incorporates neighbor label information, relation_classifier;

It should be noted that the LightGBM algorithm in the embodiments of the present application can be replaced by any machine learning algorithm that outputs probabilities, including but not limited to Logistic Regression, Random Forest, XGBoost, and GBDT.
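Because the classifier is interchangeable, the only graph-specific part of S2 is assembling the training matrix. The sketch below builds it in plain Python; the function name, the data layout, and the value 0.0 for nodes without labeled neighbors are assumptions.

```python
def augment_with_neighbor_ratio(features, labels, adjacency):
    """Append each labeled node's positive-neighbor ratio to its features.

    features: {node: list of attribute values}
    labels: {node: 0 or 1} for nodes with known true labels
    adjacency: {node: set of first-degree neighbors}
    """
    rows, targets = [], []
    for node in sorted(labels):
        known = [labels[n] for n in adjacency.get(node, ()) if n in labels]
        # Assumption: no labeled neighbors means a ratio of 0.0.
        ratio = sum(known) / len(known) if known else 0.0
        rows.append(list(features[node]) + [ratio])
        targets.append(labels[node])
    return rows, targets


train_features = {"E1": [0.2, 1.0], "E2": [0.7, 0.0], "E3": [0.5, 1.0]}
true_labels = {"E1": 1, "E2": 0, "E3": 0}
links = {"E1": {"E2", "E3"}, "E2": {"E1"}, "E3": {"E1"}}
X_rel, y = augment_with_neighbor_ratio(train_features, true_labels, links)
```

The resulting `X_rel` and `y` could then be passed to any probability-outputting classifier, for example LightGBM's `LGBMClassifier` through its `fit` and `predict_proba` methods, to obtain relation_classifier.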

Fig. 3 is a schematic structural diagram of the knowledge-graph-based anti-fraud device provided by the embodiments of the present application. As shown in Fig. 3, the device comprises:

an extraction unit 301, configured to extract entities, entity attribute data, and relation data from data sources;

a processing unit 302, configured to screen and process the entity attribute data;

a graph construction unit 303, configured to construct a knowledge graph from the processed entity attribute data and the relation data, wherein the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node with a known label and a second-class node being a node whose label is to be predicted; and

a prediction unit 304, configured to predict the labels of the second-class nodes based on the knowledge graph.

In one embodiment, the entity is an enterprise; correspondingly,

In one embodiment, the device further comprises:

a reduction unit (not shown), configured to reduce the relation data so that every relation corresponds to enterprises.

The processing unit 302 is configured to: aggregate the information of the multiple enterprise shareholders to obtain shareholder aggregate features; associate the actual-controller features and the shareholder aggregate features with the enterprise to obtain enterprise sample data; and perform at least one of the following on the enterprise sample data: outlier handling, missing-value handling, analysis of correlation between variables, and categorical-variable encoding.

In one embodiment, the graph construction unit 303 is configured to:

use enterprises as the nodes of the knowledge graph;

delete isolated nodes present in the knowledge graph.

In one embodiment, the prediction unit 304 is configured to execute the following steps:

Those skilled in the art will understand that the functions of the units in the knowledge-graph-based anti-fraud device shown in Fig. 3 can be understood with reference to the foregoing description of the knowledge-graph-based anti-fraud method. The functions of the units can be implemented by a program running on a processor or by specific logic circuits.

The technical solutions described in the embodiments of the present application may be combined arbitrarily provided there is no conflict.

In the several embodiments provided in this application, it should be understood that the disclosed method and smart device may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.

The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of this application may all be integrated into one second processing unit, each unit may stand alone as a unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or in hardware plus software functional units.

The above are only specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application.

Claims

1. a kind of anti-fraud method of knowledge based map, which is characterized in that the described method includes:

The entity attribute data are screened and handled, and utilize processed entity attribute data and the relationship numberAccording to building knowledge mapping, the knowledge mapping includes first kind node and the second class node, and the first kind node is known markThe node of label, the second class node are the node of label to be predicted；

2. the method according to claim 1, wherein the entity is enterprise；Correspondingly,

The relation data includes at least one of: enterprise and personal corresponding relationship, the personal corresponding relationship with individual, enterpriseThe corresponding relationship of industry and association attributes, personal corresponding relationship, enterprise and the corresponding relationship of enterprise with association attributes.

3. The method according to claim 2, wherein the individual customer information comprises enterprise actual-controller information and information on multiple enterprise shareholders;

at least one of the following processes is performed on the enterprise sample data: outlier processing, missing-value processing, analysis of the correlation between variables, and categorical variable encoding.
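The processing options listed above can be illustrated with a minimal pure-Python sketch. The field names (`registered_capital`, `industry`), the clipping bounds, and the choice of median imputation are assumptions made for illustration; the application does not specify them. Correlation analysis between variables is omitted here for brevity.

```python
from statistics import median

def preprocess(rows, numeric_key, category_key, lower, upper):
    """Illustrative preprocessing of enterprise sample rows (a list of dicts)."""
    # Missing-value processing: impute the median of the observed values (assumed strategy).
    observed = [r[numeric_key] for r in rows if r[numeric_key] is not None]
    fill = median(observed)
    # Categorical variable encoding: map each category to an integer code.
    codes = {c: i for i, c in enumerate(sorted({r[category_key] for r in rows}))}
    out = []
    for r in rows:
        value = fill if r[numeric_key] is None else r[numeric_key]
        # Outlier processing: clip to the assumed [lower, upper] range.
        value = min(max(value, lower), upper)
        out.append({numeric_key: value, category_key: codes[r[category_key]]})
    return out

# Hypothetical sample data.
rows = [
    {"registered_capital": 100.0, "industry": "retail"},
    {"registered_capital": None, "industry": "finance"},
    {"registered_capital": 1e9, "industry": "retail"},
    {"registered_capital": 300.0, "industry": "finance"},
]
clean = preprocess(rows, "registered_capital", "industry", lower=0.0, upper=1000.0)
```

In this sketch the missing value is filled before clipping, so an extreme outlier can still shift the imputed median; a real pipeline might reverse the order.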

4. The method according to claim 1 or 3, wherein constructing the knowledge graph using the processed entity attribute data and the relation data comprises:

taking enterprises as the nodes of the knowledge graph;

taking the processed enterprise information, the enterprise actual-controller information, and the aggregated enterprise shareholder information together as the attributes of the nodes;

deleting isolated nodes present in the knowledge graph.
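The construction steps above can be sketched as follows. This is a minimal illustration that assumes undirected enterprise-to-enterprise relations stored as an adjacency dictionary; the function name `build_graph` and the sample data are hypothetical and do not come from the application.

```python
def build_graph(attributes, relations):
    """attributes: {enterprise_id: feature dict}; relations: iterable of (id_a, id_b)."""
    neighbors = {node: set() for node in attributes}
    for a, b in relations:
        # Keep only relations between known enterprise nodes, ignoring self-loops.
        if a in neighbors and b in neighbors and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Delete isolated nodes, i.e. nodes that have no neighbors at all.
    kept = {n for n, adj in neighbors.items() if adj}
    return ({n: attributes[n] for n in kept}, {n: neighbors[n] for n in kept})

# Hypothetical data: enterprise C has no relations and is therefore removed.
attributes = {"A": {"capital": 1.0}, "B": {"capital": 2.0}, "C": {"capital": 3.0}}
relations = [("A", "B")]
node_attrs, adjacency = build_graph(attributes, relations)
```

Removing isolated nodes is reasonable in this setting because a node without neighbors contributes no neighbor-label signal to the relational prediction step.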

5. The method according to claim 1, wherein predicting the labels of the second-type nodes based on the knowledge graph comprises:

S2: extracting from the knowledge graph the attribute feature matrix of the enterprises whose true labels are known; calculating, according to the node associations in the knowledge graph, the proportion of positive samples among the first-degree neighbors of each enterprise node in the training data set; concatenating this proportion onto the enterprise attribute features; and training again to obtain a fraud prediction model relation_classifier that incorporates neighbor label information;

S3: inputting the attribute features of the enterprises with unknown labels into the model trained in S1 to obtain a preliminary fraud risk probability pos_probability, and likewise storing this probability value in the knowledge graph as an attribute of the enterprise node to be predicted;

S4: setting a predefined maximum number of iteration rounds N, initializing the iteration count i to 1, and denoting the number of enterprise nodes to be predicted as M, where N, i, and M are positive integers;

S5: calculating the non-fraud probability of each enterprise node to be predicted, neg_probability = 1 - pos_probability, taking the difference between pos_probability and neg_probability to obtain the confidence, and sorting the nodes by the absolute value of confidence;

S6: selecting the top i*M/N enterprises for pre-classification, wherein if confidence is greater than 0 the estimated label of the prediction sample is set to positive, and if confidence is less than or equal to 0 the estimated label of the prediction sample is set to negative, and writing the estimated label back to the pred attribute in the knowledge graph for storage;

S7: for each enterprise node to be predicted, calculating the positive sample proportion of its surrounding first-degree neighbors, wherein a neighboring node is included in the calculation if it belongs to the training samples or is a test node whose label was determined in the previous round, and concatenating the calculated result onto the attribute data of the enterprise;

S8: classifying using the relation_classifier from S2 to obtain the fraud risk probability pos_probability of the node for the current round, and writing it back to update the attribute value in the knowledge graph;

S9: incrementing the iteration count i by 1 and iterating S5, S6, S7, and S8, wherein the iteration terminates when i > N or when the prediction result of the current round is identical to that of the previous round.
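Steps S3 to S9 can be sketched as the loop below. This is a hedged illustration, not the applicant's implementation: `base_model` and `relation_model` are hypothetical callables standing in for the S1 model and the relation_classifier, the ranking is assumed to be in descending order of |confidence|, and labels fixed in S6 of the current round are assumed to be usable in S7.

```python
def neighbor_positive_ratio(node, adjacency, labels):
    """Share of positively labeled nodes among the labeled first-degree neighbors."""
    known = [labels[n] for n in adjacency[node] if n in labels]
    return sum(known) / len(known) if known else 0.0

def iterative_predict(adjacency, features, train_labels,
                      base_model, relation_model, max_rounds):
    unknown = [n for n in features if n not in train_labels]
    # S3: preliminary fraud probability from the attribute-only model.
    pos = {n: base_model(features[n]) for n in unknown}
    m = len(unknown)
    pred, previous = {}, None
    i = 1                                      # S4
    while i <= max_rounds:                     # S9: stop once i > N
        # S5: confidence = pos - neg; rank by absolute value (descending, assumed).
        conf = {n: pos[n] - (1.0 - pos[n]) for n in unknown}
        ranked = sorted(unknown, key=lambda n: abs(conf[n]), reverse=True)
        # S6: pre-classify the top i*M/N nodes.
        top = ranked[: i * m // max_rounds]
        pred = {n: 1 if conf[n] > 0 else 0 for n in top}
        # S7: neighbor ratio over training labels plus the estimated labels.
        known = {**train_labels, **pred}
        # S8: re-score every unknown node with the relational model.
        pos = {n: relation_model(
                   features[n] + [neighbor_positive_ratio(n, adjacency, known)])
               for n in unknown}
        if pred == previous:                   # S9: unchanged since last round
            break
        previous = pred
        i += 1
    return pred, pos

# Hypothetical toy graph: T1 (fraud) and T2 (not fraud) are labeled.
adjacency = {"T1": {"U1"}, "T2": {"U2"}, "U1": {"T1"}, "U2": {"T2"}}
features = {"T1": [0.9], "T2": [0.1], "U1": [0.8], "U2": [0.4]}
train_labels = {"T1": 1, "T2": 0}
labels, probs = iterative_predict(
    adjacency, features, train_labels,
    base_model=lambda x: x[0],
    relation_model=lambda x: 0.5 * x[0] + 0.5 * x[1],
    max_rounds=2,
)
```

Because the number of pre-classified nodes grows as i*M/N, only the most confident predictions influence their neighbors in early rounds, and all M nodes are labeled by round N unless the loop stops earlier.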

6. A knowledge-graph-based anti-fraud device, characterized in that the device comprises:

a processing unit, configured to screen and process the entity attribute data;

a graph construction unit, configured to construct a knowledge graph using the processed entity attribute data and the relation data, wherein the knowledge graph comprises first-type nodes and second-type nodes, the first-type nodes being nodes with known labels and the second-type nodes being nodes whose labels are to be predicted;

7. The device according to claim 6, wherein the entity is an enterprise; correspondingly,

8. The device according to claim 7, wherein the individual customer information comprises enterprise actual-controller information and information on multiple enterprise shareholders;

the processing unit aggregates the information on the multiple enterprise shareholders to obtain shareholder aggregation features; associates the actual-controller features and the shareholder aggregation features with the enterprise to obtain enterprise sample data; and performs at least one of the following processes on the enterprise sample data: outlier processing, missing-value processing, analysis of the correlation between variables, and categorical variable encoding.
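The aggregation and association described above might look like the following sketch. The aggregate statistics (count, mean, maximum), the `stake` field, and the `controller_` prefix are assumptions for illustration; the application does not state which aggregates are used.

```python
from statistics import mean

def build_enterprise_sample(enterprise, controller, shareholders):
    """Join controller features and aggregated shareholder features onto one enterprise row."""
    stakes = [s["stake"] for s in shareholders]
    # Shareholder aggregation features (assumed: count, mean, and max of the stake).
    aggregated = {
        "shareholder_count": len(stakes),
        "shareholder_stake_mean": mean(stakes) if stakes else 0.0,
        "shareholder_stake_max": max(stakes) if stakes else 0.0,
    }
    # Prefix controller fields so they cannot collide with enterprise fields.
    controller_features = {f"controller_{k}": v for k, v in controller.items()}
    return {**enterprise, **controller_features, **aggregated}

# Hypothetical enterprise with one actual controller and three shareholders.
sample = build_enterprise_sample(
    {"enterprise_id": "A", "capital": 500.0},
    {"age": 45},
    [{"stake": 0.5}, {"stake": 0.25}, {"stake": 0.25}],
)
```

Aggregating first gives every enterprise a fixed-length feature row regardless of how many shareholders it has, which is what allows the rows to be stacked into a single attribute feature matrix.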

9. The device according to claim 6 or 8, wherein the graph construction unit is configured to:

take enterprises as the nodes of the knowledge graph;

delete isolated nodes present in the knowledge graph.

10. The device according to claim 6, wherein the prediction unit is configured to perform the following steps: