CN110188198A - Anti-fraud method and device based on a knowledge graph

Info

Publication number
CN110188198A
Authority
CN
China
Prior art keywords
enterprise
node
data
knowledge mapping
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910415531.XA
Other languages
Chinese (zh)
Other versions
CN110188198B (en)
Inventor
窦志成
姜涛
韩维思
黄真
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wisdom Data Technology Co Ltd
Original Assignee
Beijing Wisdom Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wisdom Data Technology Co Ltd
Priority to CN201910415531.XA
Publication of CN110188198A
Application granted
Publication of CN110188198B
Status: Active
Anticipated expiration

Abstract

This application discloses an anti-fraud method and device based on a knowledge graph. The method comprises: extracting entities, entity attribute data, and relation data from data sources; screening and processing the entity attribute data, and constructing a knowledge graph from the processed entity attribute data and the relation data, where the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node whose label is known and a second-class node being a node whose label is to be predicted; and predicting the labels of the second-class nodes based on the knowledge graph.

Description

Anti-fraud method and device based on a knowledge graph
Technical field
This application relates to anti-fraud technology, and in particular to an anti-fraud method and device based on a knowledge graph.
Background
Lending institutions such as traditional banks spend a great deal of manpower and time assessing and investigating enterprises that apply for loans, which makes enterprise financing cycles long and financing costs high. The data that many small enterprises declare themselves when applying for loans contains a large amount of deceptive information, so institutions cannot correctly assess lending risk.
Summary of the invention
To solve the above technical problems, the embodiments of the present application provide an anti-fraud method and device based on a knowledge graph.
The anti-fraud method based on a knowledge graph provided by the embodiments of the present application comprises:
Extracting entities, entity attribute data, and relation data from data sources;
Screening and processing the entity attribute data, and constructing a knowledge graph from the processed entity attribute data and the relation data, where the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node whose label is known and a second-class node being a node whose label is to be predicted;
Predicting the labels of the second-class nodes based on the knowledge graph.
In one embodiment, the entity is an enterprise; correspondingly,
The entity attribute data includes enterprise information and individual customer information;
The relation data includes at least one of: relationships between an enterprise and a person, relationships between a person and a person, relationships between an enterprise and its associated attributes, relationships between a person and their associated attributes, and relationships between an enterprise and an enterprise.
In one embodiment, before the knowledge graph is constructed, the method further comprises:
Reducing the relation data so that every relationship connects enterprises.
In one embodiment, the individual customer information includes the information of the enterprise's actual controller and the information of multiple enterprise shareholders;
Screening and processing the entity attribute data comprises:
Aggregating the information of the multiple enterprise shareholders to obtain shareholder aggregate features;
Associating the actual controller's features and the shareholder aggregate features with the enterprise to obtain enterprise sample data;
Applying at least one of the following to the enterprise sample data: outlier handling, missing-value handling, analysis of correlation between variables, and categorical variable encoding.
In one embodiment, constructing the knowledge graph from the processed entity attribute data and the relation data comprises:
Taking enterprises as the nodes of the knowledge graph;
Taking the processed enterprise information, the actual controller information, and the aggregated enterprise shareholder information together as the attributes of the nodes;
Taking the reduced relationships between enterprises as the relationships of the knowledge graph;
Deleting isolated nodes present in the knowledge graph.
In one embodiment, predicting the labels of the second-class nodes based on the knowledge graph comprises:
S1: Train on the attribute features of enterprises whose true labels are known, to obtain the fraud prediction model local_classifier;
S2: Extract from the knowledge graph the attribute feature matrix of enterprises whose true labels are known; using the node associations in the knowledge graph, compute for each enterprise node in the training data set the proportion of positive samples among its first-degree neighbors; append this proportion to the enterprise attribute features and train again, to obtain the fraud prediction model relation_classifier, which incorporates neighbor label information;
S3: Feed the attribute features of enterprises whose labels are unknown into the model trained in S1 to obtain a preliminary fraud risk probability pos_probability, and store this probability value as an attribute of the corresponding enterprise node to be predicted in the knowledge graph;
S4: Set a predefined maximum number of iteration rounds N, initialize the iteration counter i to 1, and let M be the number of enterprise nodes to be predicted; N, i, and M are positive integers;
S5: For each enterprise node to be predicted, compute the non-fraud probability neg_probability = 1 - pos_probability, take the difference between pos_probability and neg_probability to obtain the confidence, and sort the nodes by the absolute value of confidence;
S6: Select the top i*M/N enterprises for pre-classification: if confidence is greater than 0, set the predicted label of the sample to positive; if confidence is less than or equal to 0, set the predicted label of the sample to negative; write the label back to the pred attribute in the knowledge graph;
S7: For each enterprise node to be predicted, compute the proportion of positive samples among its first-degree neighbors; a neighbor node is included in the calculation if it belongs to the training samples or is a test node whose label was determined in the previous round; append the result to the attribute data of the enterprise;
S8: Classify with relation_classifier from S2 to obtain the nodes' fraud risk probability pos_probability for the current round, and write it back to update the attribute value in the knowledge graph;
S9: Increment the iteration counter i by 1 and repeat S5, S6, S7, and S8; iteration ends when i > N or when the current round's predictions are identical to the previous round's.
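Steps S5 and S6 can be illustrated with a minimal Python sketch. The function name `prelabel`, the descending sort order, and the integer floor applied to i*M/N are assumptions made for illustration; the text only states that the nodes are sorted by the absolute value of confidence and that the first i*M/N are pre-classified.

```python
from typing import Dict


def prelabel(pos_probability: Dict[str, float], i: int, n_rounds: int) -> Dict[str, int]:
    """Provisional labels for the i*M/N most confident nodes (sketch of S5-S6)."""
    m = len(pos_probability)
    # S5: confidence = pos_probability - neg_probability, with neg = 1 - pos.
    confidence = {k: p - (1.0 - p) for k, p in pos_probability.items()}
    # Assumption: the most confident nodes (largest |confidence|) come first.
    ranked = sorted(confidence, key=lambda k: abs(confidence[k]), reverse=True)
    # Assumption: i*M/N is rounded down to an integer.
    top_k = min(m, (i * m) // n_rounds)
    # S6: positive if confidence > 0, otherwise negative.
    return {k: 1 if confidence[k] > 0 else 0 for k in ranked[:top_k]}


probs = {"e1": 0.95, "e2": 0.10, "e3": 0.55, "e4": 0.40}
round_1 = prelabel(probs, i=1, n_rounds=2)  # commits 2 of the 4 nodes
round_2 = prelabel(probs, i=2, n_rounds=2)  # commits all 4 nodes
```

With this schedule the share of committed nodes grows linearly with the round number, so uncertain nodes such as `e3` receive a label only after their more confident neighbors have been labelled.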
The anti-fraud device based on a knowledge graph provided by the embodiments of the present application comprises:
An extraction unit, configured to extract entities, entity attribute data, and relation data from data sources;
A processing unit, configured to screen and process the entity attribute data;
A graph construction unit, configured to construct a knowledge graph from the processed entity attribute data and the relation data, where the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node whose label is known and a second-class node being a node whose label is to be predicted;
A prediction unit, configured to predict the labels of the second-class nodes based on the knowledge graph.
In one embodiment, the entity is an enterprise; correspondingly,
The entity attribute data includes enterprise information and individual customer information;
The relation data includes at least one of: relationships between an enterprise and a person, relationships between a person and a person, relationships between an enterprise and its associated attributes, relationships between a person and their associated attributes, and relationships between an enterprise and an enterprise.
In one embodiment, the device further comprises:
A reduction unit, configured to reduce the relation data so that every relationship connects enterprises.
In one embodiment, the individual customer information includes the information of the enterprise's actual controller and the information of multiple enterprise shareholders;
The processing unit is configured to: aggregate the information of the multiple enterprise shareholders to obtain shareholder aggregate features; associate the actual controller's features and the shareholder aggregate features with the enterprise to obtain enterprise sample data; and apply at least one of the following to the enterprise sample data: outlier handling, missing-value handling, analysis of correlation between variables, and categorical variable encoding.
In one embodiment, the graph construction unit is configured to:
Take enterprises as the nodes of the knowledge graph;
Take the processed enterprise information, the actual controller information, and the aggregated enterprise shareholder information together as the attributes of the nodes;
Take the reduced relationships between enterprises as the relationships of the knowledge graph;
Delete isolated nodes present in the knowledge graph.
In one embodiment, the prediction unit is configured to execute the following steps:
S1: Train on the attribute features of enterprises whose true labels are known, to obtain the fraud prediction model local_classifier;
S2: Extract from the knowledge graph the attribute feature matrix of enterprises whose true labels are known; using the node associations in the knowledge graph, compute for each enterprise node in the training data set the proportion of positive samples among its first-degree neighbors; append this proportion to the enterprise attribute features and train again, to obtain the fraud prediction model relation_classifier, which incorporates neighbor label information;
S3: Feed the attribute features of enterprises whose labels are unknown into the model trained in S1 to obtain a preliminary fraud risk probability pos_probability, and store this probability value as an attribute of the corresponding enterprise node to be predicted in the knowledge graph;
S4: Set a predefined maximum number of iteration rounds N, initialize the iteration counter i to 1, and let M be the number of enterprise nodes to be predicted; N, i, and M are positive integers;
S5: For each enterprise node to be predicted, compute the non-fraud probability neg_probability = 1 - pos_probability, take the difference between pos_probability and neg_probability to obtain the confidence, and sort the nodes by the absolute value of confidence;
S6: Select the top i*M/N enterprises for pre-classification: if confidence is greater than 0, set the predicted label of the sample to positive; if confidence is less than or equal to 0, set the predicted label of the sample to negative; write the label back to the pred attribute in the knowledge graph;
S7: For each enterprise node to be predicted, compute the proportion of positive samples among its first-degree neighbors; a neighbor node is included in the calculation if it belongs to the training samples or is a test node whose label was determined in the previous round; append the result to the attribute data of the enterprise;
S8: Classify with relation_classifier from S2 to obtain the nodes' fraud risk probability pos_probability for the current round, and write it back to update the attribute value in the knowledge graph;
S9: Increment the iteration counter i by 1 and repeat S5, S6, S7, and S8; iteration ends when i > N or when the current round's predictions are identical to the previous round's.
The technical solution of the embodiments of the present application can use an enterprise relationship graph to identify and filter out high-risk enterprises. It integrates enterprise industrial-and-commercial data and data on the enterprise's responsible persons, takes into account the enterprises associated with the target enterprise, and ultimately depicts the enterprises and the associations between them. The solution helps identify fraud cases such as group fraud, gang-related activity, and loan fraud; it allows the risk status of loan-applying enterprises to be assessed thoroughly, concealed fraud to be prevented in advance, and the lending path to be blocked. Besides identifying enterprise fraud risk, the various relationship graphs that are built can also support visual analysis and mining of shareholding structures, litigation cases, senior-executive relationships, kinship, and so on.
Brief description of the drawings
Fig. 1 is a flow diagram of the anti-fraud method based on a knowledge graph provided by the embodiments of the present application;
Fig. 2 is a logic diagram of the risk probability prediction algorithm provided by the embodiments of the present application;
Fig. 3 is a schematic structural diagram of the anti-fraud device based on a knowledge graph provided by the embodiments of the present application.
Detailed description of embodiments
To make the technical solution of the embodiments of the present application easier to understand, the related technologies are described below.
Current approaches to credit anti-fraud can be summarized as four major technical means: black and white lists, rule engines, supervised learning, and unsupervised learning.
Black and white lists are the simplest and most basic anti-fraud means: a new applicant is matched against historical blacklist data in order to filter out fraudulent users.
Rule engines originate from rule-based expert systems and simulate human behavior so that a computer can make decisions automatically. A rule engine is a mechanism, built on a thorough understanding of the characteristics and patterns of fraud, that designs and triggers rules targeting single or combined fraudulent behaviors.
Supervised learning is currently the most widely used machine learning approach in anti-fraud detection. It requires collecting known fraud data and normal data as a training set; the trained model, through an abstract understanding of user features, analyzes hidden relationships between features, filling in and strengthening coverage of the complex fraud that rule engines cannot cover.
Unsupervised learning is an anti-fraud strategy that has gradually emerged in recent years. Such detection algorithms do not rely on any labels for model training; through association analysis and similarity analysis they discover common anomalies in the behavior of fraudulent users, create clusters, and uncover unknown fraud within one or more groups.
Each of the four technical means has its own problems:
1. Black and white lists are easy to use, but they take a long time to accumulate and are costly to purchase. Their effectiveness at identifying fraudsters is limited by the size and source of the blacklist, and they inherently lag in time, so it is difficult to contain fraud cases in advance.
2. Rule engines based on expert experience are simple to configure, but rules are formulated and updated from business experience, which carries some risk of misjudgment. A rule engine can identify new fraudsters but cannot detect new fraud patterns. Because a rule is effective only for a limited time, maintaining a rule engine consumes substantial operating resources, time, and expense.
3. Widely used supervised learning avoids interference from human experience, but collecting enough training data with accurately labelled data limits supervised machine learning to some extent. Most machine learning models, especially the logistic regression models commonly used in the financial industry, need long training times and therefore struggle to cope with ever-changing fraud. In addition, traditional supervised machine learning is mostly suited to independently distributed data, i.e. cases where the features of different samples are not interrelated or interdependent. In anti-fraud scenarios, however, hidden associations between enterprises often carry unknown latent information, and using information about associated enterprises to predict the enterprise itself is a challenge for traditional supervised learning.
4. Unsupervised learning does not require a large amount of manual labelling, but its clustering results still need to be screened by business experts using domain knowledge, and there is no clear standard for evaluating cluster quality.
To solve the above problems, the following technical solution of the embodiments of the present application is proposed. It aims to combine publicly disclosed enterprise fraud records, integrate multiple authoritative data sources, actively construct a relationship graph between enterprises, comprehensively assess the operating condition of unknown target enterprises, quantify enterprise fraud risk, and help lending institutions quickly formulate risk control strategies.
So that the features and technical content of the embodiments of the present application can be understood in more detail, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments. The related concepts involved in the embodiments are first explained below:
Entity: a thing that is distinguishable and exists independently. The anti-fraud graph constructed in this application contains only one kind of entity, namely the enterprise.
Relationship: a connection between entities, for example "same actual controller" or "same telephone number".
Attribute: a description of an entity or relationship. Entities generally have attributes, such as an enterprise's industrial-and-commercial data. Relationships can also have attributes, such as the weight of a relationship.
First-degree association (first-degree neighbors): the nodes directly connected to the target node.
Enterprise shareholder (used here in the broad sense of a stakeholder): the actual controller, legal representative, senior executives, and shareholders.
AUM (assets under management): a metric banks use to assess a customer, measuring the customer's contribution to the bank.
LightGBM (Light Gradient Boosting Machine): an open-source framework that implements the GBDT (Gradient Boosting Decision Tree) algorithm and supports efficient parallel training.
Fig. 1 is a flow diagram of the anti-fraud method based on a knowledge graph provided by the embodiments of the present application. As shown in Fig. 1, the method comprises the following steps:
Step 101: Extract entities, entity attribute data, and relation data from data sources.
All data sources used in this application come from the enterprise data and individual customer data provided by the institution and from external third-party data. Data extraction can be divided into the extraction of entities and attributes and the extraction of relationships. In an optional embodiment, the extraction scope is limited to enterprises whose loan application time falls between time t1 and time t2 and which have repayment performance; for example, enterprises whose loan application time falls between January 2018 and December 2018 and which have repayment performance.
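The extraction scope described above amounts to a simple filter. The following Python sketch shows one way to express it; the record type `LoanApplication`, its field names, and the inclusive date bounds are assumptions for illustration and are not specified in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class LoanApplication:
    # Hypothetical record layout; field names are illustrative only.
    enterprise_id: str
    applied_on: date
    has_repayment_record: bool


def in_scope(app: LoanApplication, t1: date, t2: date) -> bool:
    """Keep applications filed within [t1, t2] that have repayment performance."""
    return t1 <= app.applied_on <= t2 and app.has_repayment_record


applications: List[LoanApplication] = [
    LoanApplication("E1", date(2018, 3, 15), True),
    LoanApplication("E2", date(2018, 6, 1), False),   # no repayment performance
    LoanApplication("E3", date(2019, 1, 10), True),   # outside the window
]
selected = [a.enterprise_id for a in applications
            if in_scope(a, date(2018, 1, 1), date(2018, 12, 31))]
```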
A) Extraction of entities and attributes
This application models at enterprise granularity: each entity is an enterprise. An entity's attributes consist of the attribute information of the enterprise itself and the attribute information of the individual customers related to the enterprise. The attribute information of the enterprise itself (referred to as enterprise data or enterprise information) includes, but is not limited to, basic information such as the enterprise's technical number, industrial-and-commercial data, telephone number, registered address, industry category, and date of incorporation, as well as enterprise deposit data, transfer data, and loan data. The attribute information of the individual customers related to the enterprise (referred to as individual customer data or individual customer information) includes, but is not limited to, basic information such as the individual's technical number, gender, age, education, residential area, occupation, position, professional title, marital status, and children, as well as individual deposit data and loan data.
B) Extraction of relationships
The data tables involved in relationship extraction fall into the following five classes:
1. Relationships between an enterprise and a person (actual-controller relation table, senior-executive relation table, legal-representative relation table, equity-control relation table).
2. Relationships between a person and a person (lineal-relative relation table, spouse relation table).
3. Relationships between an enterprise and its associated attributes (enterprise address relation table, enterprise telephone number relation table).
4. Relationships between a person and their associated attributes (personal address relation table, personal telephone number relation table, personal device usage relation table).
5. Relationships between an enterprise and an enterprise (enterprise guarantee relation table).
In the embodiments of the present application, the relation data needs to be reduced so that every relationship connects enterprises. Specifically, the original enterprise relationship information comes from many sources and is heterogeneous and fragmented. To keep the overall enterprise relationship network homogeneous, i.e. to keep the entities of the knowledge graph uniform, this application reduces the relationships in the manner shown in Table 1 below, ensuring that every relationship connects the enterprises themselves.
Table 1
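Without relying on the specific rules of Table 1, the general idea of reduction can be sketched in Python for the simplest case: two enterprises that share the same intermediate person or attribute value are linked directly, and the intermediate is dropped. The function `reduce_relations`, the tuple layout, and the `same_` edge naming are assumptions for illustration; person-to-person relationships such as kinship would need an additional hop that is not shown here.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, Set, Tuple


def reduce_relations(
    records: Iterable[Tuple[str, str, str]]
) -> Set[Tuple[str, str, str]]:
    """records holds (enterprise_id, relation_type, intermediate_key) triples."""
    groups = defaultdict(set)
    for enterprise, rel_type, key in records:
        groups[(rel_type, key)].add(enterprise)
    edges = set()
    for (rel_type, _key), enterprises in groups.items():
        # Every pair of enterprises sharing the intermediate gets one edge.
        for a, b in combinations(sorted(enterprises), 2):
            edges.add((a, b, "same_" + rel_type))
    return edges


records = [
    ("E1", "actual_controller", "P100"),
    ("E2", "actual_controller", "P100"),
    ("E2", "telephone", "010-5550100"),
    ("E3", "telephone", "010-5550100"),
    ("E4", "telephone", "010-5550199"),  # shared with nobody, yields no edge
]
edges = reduce_relations(records)
```

After reduction the graph contains only enterprise nodes, which matches the requirement that the entities of the knowledge graph be uniform.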
Step 102: Screen and process the entity attribute data.
A) Enterprise data
The attribute data of an enterprise comes from the enterprise information extracted from the data sources as described above. It specifically includes basic information such as the enterprise's technical number, industrial-and-commercial data, registered address, industry category, date of incorporation, and registration date, as well as enterprise deposit data, transfer data, and loan data. The industrial-and-commercial data includes five items: registered capital, total assets at annual inspection, total profit, sales or operating income, and total net assets. The enterprise deposit data includes the enterprise's deposit balance at the data cutoff time and its monthly, quarterly, and annual deposit accumulations. The enterprise transfer data includes the total number of transfers (in and out) and the total transfer amount within one year. The enterprise loan data includes, within one year, the number of loan applications, the number of rejected loan applications, and the overdue status (overdue principal, interest, and number of days).
B) Data on the enterprise's responsible persons
Enterprise data alone is not sufficient to predict fraud cases. Therefore, in addition to the enterprise attribute data, this application matches each enterprise with the information of its actual controller and other shareholders to build multi-dimensional, enterprise-based feature data and strengthen the representational power of the overall data. Each enterprise has a unique actual controller and multiple other shareholders, and the actual controller is more closely associated with the enterprise than the other shareholders are. Therefore, the actual controller's information is spliced with the enterprise information on its own, and the information of the enterprise's persons other than the actual controller (i.e. the shareholders) is aggregated to further extend the enterprise features. Both the actual controller's information and the other shareholders' information come from the individual customer data extracted above, specifically including the individual's technical number, gender, age, education, residential area, occupation, position, professional title, marital status, and children, as well as individual deposit data and loan data. The individual deposit data includes the individual's deposit balance at the data cutoff time, the monthly, quarterly, and annual deposit accumulations, and the individual's point-in-time, monthly average, and annual average AUM values. The loan data includes, within one year, the number of loan applications and the overdue status (number of consecutive periods in arrears, overdue principal, interest, and maximum number of days in arrears).
When the other shareholders of an enterprise are aggregated, the aggregate functions chosen for different variables are:
Numeric variables: maximum, sum, median, and mean, each computed separately
Categorical variables: mode
Finally, the processed actual controller features and shareholder aggregate features are associated with the enterprise, so that each sample (enterprise) corresponds to one record. At least one of the following is then applied to the enterprise sample data: outlier handling, missing-value handling, analysis of correlation between variables, and categorical variable encoding. For example, after features whose missing rate is 80% or higher, or whose Pearson correlation coefficient is higher than 0.98, are deleted, the remaining features together serve as the enterprise attribute features for model training.
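The aggregation and the missing-rate filter can be sketched with the Python standard library. The helper names `aggregate_shareholders` and `high_missing_columns`, the dictionary-per-record layout, and the feature-name suffixes are assumptions for illustration; the Pearson-correlation filter is not shown.

```python
from statistics import mean, median, mode
from typing import Any, Dict, List, Sequence


def aggregate_shareholders(
    shareholders: Sequence[Dict[str, Any]],
    numeric_fields: Sequence[str],
    categorical_fields: Sequence[str],
) -> Dict[str, Any]:
    """Collapse several shareholder records into one feature row per enterprise."""
    features: Dict[str, Any] = {}
    for field in numeric_fields:
        values = [s[field] for s in shareholders if s.get(field) is not None]
        if values:  # numeric variables: maximum, sum, median, mean
            features[f"{field}_max"] = max(values)
            features[f"{field}_sum"] = sum(values)
            features[f"{field}_median"] = median(values)
            features[f"{field}_mean"] = mean(values)
    for field in categorical_fields:
        values = [s[field] for s in shareholders if s.get(field) is not None]
        if values:  # categorical variables: mode
            features[f"{field}_mode"] = mode(values)
    return features


def high_missing_columns(rows: Sequence[Dict[str, Any]], threshold: float = 0.8) -> List[str]:
    """Columns whose share of missing values is at or above the threshold."""
    columns = {c for row in rows for c in row}
    return sorted(
        c for c in columns
        if sum(row.get(c) is None for row in rows) / len(rows) >= threshold
    )


shareholders = [
    {"age": 40, "education": "bachelor"},
    {"age": 50, "education": "master"},
    {"age": 60, "education": "bachelor"},
]
agg = aggregate_shareholders(shareholders, ["age"], ["education"])

rows = [{"a": 1, "x": None}, {"a": 2, "x": None}, {"a": 3, "x": None},
        {"a": 4, "x": None}, {"a": 5, "x": 7}]
to_drop = high_missing_columns(rows)  # "x" is missing in 4 of 5 rows
```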
Step 103: Construct a knowledge graph from the processed entity attribute data and the relation data, where the knowledge graph includes first-class nodes and second-class nodes, a first-class node being a node whose label is known and a second-class node being a node whose label is to be predicted.
The knowledge graph in the embodiments of the present application is also called the anti-fraud enterprise knowledge graph. It is constructed as follows:
A) Take enterprises as the nodes of the knowledge graph.
Specifically, the enterprises that submitted credit applications serve as the node entities of the graph; the persons or objects used to establish relationships are reduced away during relationship construction;
B) Take the processed enterprise information, the actual controller information, and the aggregated enterprise shareholder information together as the attributes of the nodes.
Specifically, the enterprise basic information, the actual controller's basic information, and the aggregated shareholder basic information processed in step 102 together serve as the attributes of the entities.
C) Take the reduced relationships between enterprises as the relationships of the knowledge graph.
Specifically, the various reduced relationships between enterprises from step 101 serve as the relationships of the graph.
D) Delete isolated nodes present in the knowledge graph.
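Steps A) to D) can be sketched as building an undirected adjacency structure and then dropping nodes without edges. The function `build_graph` and the in-memory dictionary representation are assumptions for illustration; the text does not say which graph store is used.

```python
from typing import Any, Dict, Iterable, Set, Tuple


def build_graph(
    attributes: Dict[str, Dict[str, Any]],
    edges: Iterable[Tuple[str, str, str]],
) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Set[str]]]:
    """Return (node attributes, adjacency) with isolated nodes removed."""
    adjacency: Dict[str, Set[str]] = {node: set() for node in attributes}
    for a, b, _relation in edges:
        # Only link enterprises that exist as nodes; ignore self-loops.
        if a in adjacency and b in adjacency and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    connected = {n for n, neighbours in adjacency.items() if neighbours}
    return (
        {n: attributes[n] for n in connected},
        {n: adjacency[n] for n in connected},
    )


attributes = {
    "E1": {"registered_capital": 500},
    "E2": {"registered_capital": 120},
    "E3": {"registered_capital": 80},
    "E4": {"registered_capital": 30},  # no relationship, will be removed
}
edges = [("E1", "E2", "same_actual_controller"), ("E2", "E3", "same_telephone")]
node_attrs, adjacency = build_graph(attributes, edges)
```

Removing isolated nodes makes sense here because a node without neighbors contributes nothing to, and gains nothing from, the neighbor-label features used in the later prediction step.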
In the embodiments of the present application, because the historical data lacks a precise definition of enterprise fraud, an enterprise fraud label is established from the serious violation records of enterprises and enterprise shareholders disclosed over a period of time by the institution internally or by relevant authorities, and this label is used as the target variable. The serious violation data of enterprises and individuals includes, but is not limited to: 1. the fraud list in the institution's internal fraud system; 2. administrative violation records of enterprises and individuals and suspect blacklists; 3. a number of cross-platform loans greater than N, where N is, for example, 4.
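One possible way to turn the three sources above into a binary target variable is sketched below. The function `fraud_label`, the set and dictionary inputs, and the rule that a single hit on the enterprise or any of its related persons marks the enterprise as positive are assumptions for illustration.

```python
from typing import Dict, Iterable, Set


def fraud_label(
    entity_ids: Iterable[str],
    internal_fraud_list: Set[str],
    violation_list: Set[str],
    cross_platform_loans: Dict[str, int],
    max_platform_loans: int = 4,
) -> int:
    """Return 1 if the enterprise or any related person hits a violation source."""
    for entity_id in entity_ids:
        if entity_id in internal_fraud_list or entity_id in violation_list:
            return 1
        # Source 3: number of cross-platform loans greater than N (here N = 4).
        if cross_platform_loans.get(entity_id, 0) > max_platform_loans:
            return 1
    return 0


internal = {"E7"}
violations = {"P3"}
loans = {"E9": 5, "E1": 4}
label_clean = fraud_label(["E1", "P1"], internal, violations, loans)
label_person_hit = fraud_label(["E2", "P3"], internal, violations, loans)
label_loans_hit = fraud_label(["E9"], internal, violations, loans)
```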
Step 104: being based on the knowledge mapping, predict the label of the second class node.
In the embodiment of the present application, be based on the knowledge mapping, predict the label of the second class node can be defined asLower problem: the business connection map constructed based on step 103 is denoted as G, and all nodes are denoted as V on G, and X is enterprise in step 1Self attributes vector, Vk is known label node in G, and Vu is node to be predicted in enterprise's map G.Known map structure G,Attribute X entrained by all nodes (enterprise) and part known label node Vk utilizes the tag types of information above prediction Vu.Corresponding pseudocode is as follows:
Referring to Fig. 2, following steps are may be implemented in above-mentioned pseudocode:
S1: Train on the attribute features of enterprises whose true labels are known, using the LightGBM framework open-sourced by Microsoft, to obtain a fraud prediction model local_classifier that depends only on each enterprise's own attributes;
S2: Extract from the knowledge graph the attribute feature matrix of enterprises whose true labels are known and, according to the node associations in the knowledge graph, calculate the proportion of positive samples among the first-degree neighbors of each enterprise node in the training data set; append this proportion to the enterprise attribute features and train again with the LightGBM framework, obtaining a fraud prediction model relation_classifier that incorporates neighbor label information;
S3: Input the attribute features of the enterprises with unknown labels into the model trained in S1 to obtain a preliminary fraud risk probability pos_probability; this probability value is likewise stored in the knowledge graph as an attribute of the enterprise node to be predicted;
S4: Set a predefined maximum number of iteration rounds N, initialize the iteration counter i to 1, and let the number of enterprise nodes to be predicted be M; N, i, and M are positive integers;
S5: For each enterprise node to be predicted, calculate the non-fraud probability neg_probability = 1 - pos_probability, take the difference between pos_probability and neg_probability to obtain the confidence value confidence, and sort the nodes by the absolute value of confidence;
S6: Select the first i*M/N enterprises in the ranking for pre-classification: if confidence is greater than 0, set the predicted label of that sample to positive; if confidence is less than or equal to 0, set the predicted label of that sample to negative; write the result back to the pred attribute in the knowledge graph;
S7: For each enterprise node to be predicted, calculate the proportion of positive samples among its first-degree neighbors; a neighbor node is included in the calculation if it belongs to the training samples or is a test node whose label was determined in the previous round; append the result to the attribute data of the enterprise;
S8: Classify using the relation_classifier from S2 to obtain the fraud risk probability pos_probability of each node for the current round, and write it back to update the attribute value in the knowledge graph;
S9: Increment the iteration counter i by 1 and repeat S5, S6, S7, and S8; iteration ends when i > N or when the prediction results of the current round are identical to those of the previous round.
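Steps S3 to S9 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pseudocode of Fig. 2. The two classifiers are passed in as plain functions that map a feature vector to a fraud probability (a hypothetical interface). The ranking in S5 is taken to be in descending order of absolute confidence, i*M/N is rounded up, a node with no labeled neighbors gets a ratio of 0.0, and in S7 "previous round" is read literally, so labels written in S6 only become visible in the next round. All function and parameter names are illustrative.

```python
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical model interface: one feature vector in, P(fraud) out.
Scorer = Callable[[Sequence[float]], float]


def positive_neighbor_ratio(node: str,
                            neighbors: Dict[str, Set[str]],
                            labels: Dict[str, int]) -> float:
    """Proportion of positive labels among labeled first-degree neighbors."""
    known = [labels[n] for n in neighbors.get(node, set()) if n in labels]
    # Assumption: a node without labeled neighbors gets a ratio of 0.0.
    return sum(known) / len(known) if known else 0.0


def iterative_predict(attrs: Dict[str, List[float]],
                      neighbors: Dict[str, Set[str]],
                      train_labels: Dict[str, int],
                      local_classifier: Scorer,
                      relation_classifier: Scorer,
                      max_rounds: int) -> Dict[str, int]:
    unknown = sorted(v for v in attrs if v not in train_labels)
    m = len(unknown)
    # S3: preliminary fraud probability from the node's own attributes.
    pos_probability = {v: local_classifier(attrs[v]) for v in unknown}
    pred: Dict[str, int] = {}  # labels written back in the previous round
    for i in range(1, max_rounds + 1):  # S4 and S9: stop once i > N
        # S5: confidence = pos - neg = 2 * pos - 1, ranked by absolute value.
        confidence = {v: 2.0 * pos_probability[v] - 1.0 for v in unknown}
        ranked = sorted(unknown, key=lambda v: abs(confidence[v]), reverse=True)
        # S6: pre-classify the first i*M/N nodes (rounded up, an assumption).
        top = ranked[:(i * m + max_rounds - 1) // max_rounds]
        new_pred = {v: 1 if confidence[v] > 0 else 0 for v in top}
        # S7: neighbors count if labeled in training or in the previous round.
        usable = {**train_labels, **pred}
        for v in unknown:
            ratio = positive_neighbor_ratio(v, neighbors, usable)
            # S8: re-score with the relational model on attributes + ratio.
            pos_probability[v] = relation_classifier(list(attrs[v]) + [ratio])
        # S9: also stop when this round's labels match the previous round's.
        if new_pred == pred:
            break
        pred = new_pred
    return pred
```

Because the selection in S6 grows with i, the last round (i = N) pre-classifies all M nodes, so the returned dictionary covers every node to be predicted unless the loop stops early on identical results.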
It should be noted that the LightGBM algorithm in the embodiment of the present application may be replaced by any machine learning algorithm that can output probabilities, including but not limited to Logistic Regression, Random Forest, XGBoost, and GBDT.
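One way to keep the model interchangeable is to depend only on a probability-producing method. The sketch below assumes the scikit-learn style `predict_proba` convention (one row of class probabilities per input row), which `LGBMClassifier`, `XGBClassifier`, and scikit-learn's `LogisticRegression` and `RandomForestClassifier` all follow. The names `ProbabilisticClassifier`, `as_scorer`, and `ConstantModel` are hypothetical, and the assumption that the fraud class sits in column 1 holds only when labels are encoded as 0 and 1.

```python
from typing import Callable, List, Protocol, Sequence


class ProbabilisticClassifier(Protocol):
    """Any fitted model exposing a scikit-learn style predict_proba method."""

    def predict_proba(self, rows: Sequence[Sequence[float]]) -> Sequence[Sequence[float]]:
        ...


def as_scorer(model: ProbabilisticClassifier,
              positive_index: int = 1) -> Callable[[Sequence[float]], float]:
    """Wrap a fitted model as a function: feature vector -> P(fraud)."""

    def score(features: Sequence[float]) -> float:
        # predict_proba takes a batch, so wrap the single vector in a list.
        probabilities = model.predict_proba([list(features)])
        return float(probabilities[0][positive_index])

    return score


class ConstantModel:
    """Toy stand-in that always returns the same probability (illustration only)."""

    def __init__(self, p: float) -> None:
        self.p = p

    def predict_proba(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        return [[1.0 - self.p, self.p] for _ in rows]
```

With such a wrapper, swapping LightGBM for another library changes only the object passed to `as_scorer`, not the code that consumes the probabilities.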
Fig. 3 is a schematic structural diagram of the knowledge-graph-based anti-fraud device provided by the embodiment of the present application. As shown in Fig. 3, the knowledge-graph-based anti-fraud device includes:
An extraction unit 301, configured to extract entities, entity attribute data, and relation data from a data source;
A processing unit 302, configured to screen and process the entity attribute data;
A graph construction unit 303, configured to construct a knowledge graph using the processed entity attribute data and the relation data, where the knowledge graph includes first-class nodes and second-class nodes, the first-class nodes being nodes with known labels and the second-class nodes being nodes whose labels are to be predicted;
A prediction unit 304, configured to predict the labels of the second-class nodes based on the knowledge graph.
In one embodiment, the entity is an enterprise; correspondingly,
The entity attribute data include enterprise information and individual customer information;
The relation data include at least one of the following: the correspondence between enterprises and individuals, the correspondence between individuals and individuals, the correspondence between enterprises and associated attributes, the correspondence between individuals and associated attributes, and the correspondence between enterprises and enterprises.
In one embodiment, described device further include:
A reduction unit (not shown in the figure), configured to reduce the relation data so that every relation corresponds to enterprises.
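The passage above does not spell out how the reduction is performed. As one possible reading, offered purely as an assumption, enterprise-to-individual relations can be projected onto enterprise-to-enterprise edges by linking any two enterprises that share an individual. The function name and the input format below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def reduce_to_enterprise_edges(
        enterprise_person: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Link two enterprises whenever they share at least one individual.

    Assumption: this projection is only one plausible form of the reduction.
    """
    by_person: Dict[str, Set[str]] = defaultdict(set)
    for enterprise, person in enterprise_person:
        by_person[person].add(enterprise)

    edges: Set[Tuple[str, str]] = set()
    for enterprises in by_person.values():
        # Sorting stores each undirected edge once, as (smaller, larger).
        for a, b in combinations(sorted(enterprises), 2):
            edges.add((a, b))
    return edges
```

For example, if person p1 is linked to enterprises E1 and E2, the reduced relation data contain the single edge (E1, E2), and p1 no longer appears as a node.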
In one embodiment, the individual customer information includes the information of the enterprise's actual controller and the information of multiple enterprise shareholders;
The processing unit 302 aggregates the information of the multiple enterprise shareholders to obtain shareholder aggregate features; associates the actual controller features and the shareholder aggregate features with the enterprise to obtain enterprise sample data; and performs at least one of the following on the enterprise sample data: outlier handling, missing-value handling, correlation analysis between variables, and encoding of categorical variables.
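The aggregation step turns a variable number of shareholder records into a fixed set of features that can be attached to the enterprise. A minimal sketch follows; the field names (`share`, `age`, `registered_capital`) and the chosen statistics (count, maximum, mean) are assumptions for illustration, since the text does not list the actual features.

```python
from statistics import mean
from typing import Dict, List


def aggregate_shareholders(shareholders: List[Dict[str, float]]) -> Dict[str, float]:
    """Collapse any number of shareholder records into fixed-length features."""
    if not shareholders:
        # Assumption: enterprises without shareholder records get zeros.
        return {"shareholder_count": 0.0, "share_max": 0.0, "age_mean": 0.0}
    return {
        "shareholder_count": float(len(shareholders)),
        "share_max": max(s["share"] for s in shareholders),
        "age_mean": mean([s["age"] for s in shareholders]),
    }


def build_enterprise_sample(enterprise: Dict[str, float],
                            controller: Dict[str, float],
                            shareholders: List[Dict[str, float]]) -> Dict[str, float]:
    """Associate controller features and shareholder aggregates with the enterprise."""
    sample = dict(enterprise)
    # Prefix controller fields so they cannot collide with enterprise fields.
    sample.update({f"controller_{key}": value for key, value in controller.items()})
    sample.update(aggregate_shareholders(shareholders))
    return sample
```

Outlier handling, missing-value handling, and categorical encoding would then operate on the flat dictionaries (or the table built from them) produced by `build_enterprise_sample`.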
In one embodiment, the graph construction unit 303 is configured to:
Use enterprises as the nodes of the knowledge graph;
Use the processed enterprise information, the actual controller information, and the aggregated enterprise shareholder information together as the node attributes;
Use the reduced relations between enterprises as the relations of the knowledge graph;
Delete the isolated nodes present in the knowledge graph.
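The four construction steps above can be sketched with plain dictionaries. This is an illustrative sketch only: it treats relations as undirected and ignores edges that touch an enterprise without attributes, both of which are assumptions, and the function name `build_graph` is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(node_attrs: Dict[str, List[float]],
                edges: Iterable[Tuple[str, str]]
                ) -> Tuple[Dict[str, List[float]], Dict[str, Set[str]]]:
    """Build an enterprise graph and drop nodes that have no relations."""
    # Every enterprise with attributes starts as a node with no neighbors.
    neighbors: Dict[str, Set[str]] = {node: set() for node in node_attrs}
    for a, b in edges:
        # Skip self-loops and edges touching an enterprise with no attributes.
        if a == b or a not in neighbors or b not in neighbors:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Delete isolated nodes: keep only enterprises with at least one neighbor.
    connected = {node: adj for node, adj in neighbors.items() if adj}
    attrs = {node: node_attrs[node] for node in connected}
    return attrs, connected
```

Removing isolated nodes matters for the later prediction steps, because a node without neighbors contributes no neighbor label information.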
In one embodiment, the prediction unit 304 is configured to execute the following steps:
S1: Train on the attribute features of enterprises whose true labels are known to obtain a fraud prediction model local_classifier;
S2: Extract from the knowledge graph the attribute feature matrix of enterprises whose true labels are known and, according to the node associations in the knowledge graph, calculate the proportion of positive samples among the first-degree neighbors of each enterprise node in the training data set; append this proportion to the enterprise attribute features and train again, obtaining a fraud prediction model relation_classifier that incorporates neighbor label information;
S3: Input the attribute features of the enterprises with unknown labels into the model trained in S1 to obtain a preliminary fraud risk probability pos_probability; this probability value is likewise stored in the knowledge graph as an attribute of the enterprise node to be predicted;
S4: Set a predefined maximum number of iteration rounds N, initialize the iteration counter i to 1, and let the number of enterprise nodes to be predicted be M; N, i, and M are positive integers;
S5: For each enterprise node to be predicted, calculate the non-fraud probability neg_probability = 1 - pos_probability, take the difference between pos_probability and neg_probability to obtain the confidence value confidence, and sort the nodes by the absolute value of confidence;
S6: Select the first i*M/N enterprises in the ranking for pre-classification: if confidence is greater than 0, set the predicted label of that sample to positive; if confidence is less than or equal to 0, set the predicted label of that sample to negative; write the result back to the pred attribute in the knowledge graph;
S7: For each enterprise node to be predicted, calculate the proportion of positive samples among its first-degree neighbors; a neighbor node is included in the calculation if it belongs to the training samples or is a test node whose label was determined in the previous round; append the result to the attribute data of the enterprise;
S8: Classify using the relation_classifier from S2 to obtain the fraud risk probability pos_probability of each node for the current round, and write it back to update the attribute value in the knowledge graph;
S9: Increment the iteration counter i by 1 and repeat S5, S6, S7, and S8; iteration ends when i > N or when the prediction results of the current round are identical to those of the previous round.
Those skilled in the art will appreciate that the functions implemented by each unit of the knowledge-graph-based anti-fraud device shown in Fig. 3 can be understood with reference to the foregoing description of the knowledge-graph-based anti-fraud method. The functions of each unit of the device shown in Fig. 3 may be implemented by a program running on a processor, or by a specific logic circuit.
The technical solutions described in the embodiments of the present application may be combined arbitrarily, provided there is no conflict.
In the several embodiments provided in the present application, it should be understood that the disclosed method and smart device may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may all be integrated into a second processing unit, each unit may serve separately as a unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in hardware, or in hardware plus software functional units.
The above is only a specific embodiment of the present application, but the protection scope of the application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall be covered by the protection scope of the present application.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910415531.XA | 2019-05-13 | 2019-05-13 | Anti-fraud method and device based on knowledge graph

Publications (2)

Publication Number | Publication Date
CN110188198A | 2019-08-30
CN110188198B | 2021-06-22

Family ID: 67716779

Family Applications (1)

Application Number | Status | Title | Priority Date | Filing Date
CN201910415531.XA | Active | Anti-fraud method and device based on knowledge graph | 2019-05-13 | 2019-05-13

Country Status (1)

Country | Link
CN (1) | CN110188198B (en)

Also Published As

Publication Number | Publication Date
CN110188198B (en) | 2021-06-22

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
