CN103379111A

Info

Publication number: CN103379111A
Application number: CN2012101297567A
Authority: CN
Inventors: 黄华军
Original assignee: Central South University of Forestry and Technology
Current assignee: Central South University of Forestry and Technology
Priority date: 2012-04-21
Filing date: 2012-04-21
Publication date: 2013-10-30

Abstract

The invention relates to an intelligent phishing defense system composed of a user behavior identification module, a lightweight intelligent phishing-website detection engine, and an intelligent phishing processing module. The user behavior identification module uses a Bayes-based user behavior recognition algorithm. The detection engine performs rapid detection in four layers (URL, interactivity, webpage noise, and site logo recognition). It comprises an online phishing-URL detection algorithm that fuses multi-layer features, DOM-tree-based identification and filtering of forms submitted to the web server, a phishing-website detection learning algorithm based on webpage noise, and a phishing-website detection algorithm based on site logo recognition. The processing module follows the browser BHO object specification: for a detected phishing website it first warns the user through the URL address bar, the status bar, or other warning signs; if the user ignores the warning, it obfuscates the information the user enters in order to protect it. The system thereby provides intelligent and timely phishing defense services for network users.

Description

An intelligent phishing defense system

Technical field:

The present invention relates to an intelligent phishing defense system, specifically one that protects network users.

Background technology:

Phishing is an attack technique based on social engineering. The attacker sends deceptive messages that claim to come from a bank or another well-known institution, via spam e-mail, instant messaging, SMS, or fraudulent web pages, in order to lure users into logging in to a fake website that looks genuine and into providing sensitive information (such as user names, passwords, account IDs, ATM PINs, or credit card numbers).

Phishing defense is the set of countermeasures against phishing. It can be divided into server-side defense, client-side defense, and third-party defense. Server-side defense means that the website server proves the authenticity of the site's identity to the user by means of additional technologies such as digital watermarking, digital fingerprinting, dynamic security skins, or double-verification protocols. Client-side defense means installing a plug-in in the user's browser that, after detecting a phishing page, warns the user or protects the input of sensitive information. Third-party defense includes phishing spam filtering, third-party authentication mechanisms, URL blacklist filtering by browser vendors, defense mechanisms of security software vendors, and public protection mechanisms. Server-side defense aims to protect the authenticity of the website's identity; it raises the cost of counterfeiting for phishers and curbs phishing websites at the source, so it is an active defense. The latter two target phishing websites that have already appeared; their techniques lag behind the counterfeiting techniques of phishing websites, so they are passive defenses. Although phishing defense has made considerable progress, active defense has the drawback that the end user must ultimately judge the authenticity of the website, and passive defense has the drawback that it cannot protect a user who has not installed the plug-in.

Summary of the invention:

In view of the above problems, the object of the invention is to provide an intelligent phishing defense system composed of a user behavior identification module, a lightweight intelligent phishing-website detection engine, and an intelligent phishing processing module, which provides intelligent and timely phishing defense services for network users.

To achieve the above object, the invention adopts the following technical scheme:

1. a user behavior identification module;

2. a lightweight intelligent phishing-website detection engine;

3. an intelligent phishing processing module.

By adopting the above technical scheme, the invention has the following advantages:

1. a Bayes-based user behavior recognition algorithm;

2. a phishing-website detection learning algorithm based on webpage noise;

3. a phishing-website detection learning algorithm based on site logo recognition.

Embodiment:

(1) User behavior understanding, learning, and recognition

User behavior understanding comprises the formal understanding and learning of user behavior, the construction of a database of prior probability density distributions of user browsing behavior, and Bayes-based user behavior recognition. Online questionnaires, paper questionnaires, randomly sent test e-mails, and similar methods are used to collect the types of normal and suspicious browsing behavior. These cover how users reach and use a website (typing the URL into the address bar, entering information, clicking a link, downloading) and where the links they follow come from (e-mail, QQ, shopping websites). Rules of the form "if the URL is typed into the address bar, then normal browsing behavior" describe user browsing behavior formally. From these the prior probability density distribution of user browsing behavior is established, and a Bayes-based user behavior recognition algorithm is built that activates the detection engine when the user may be accessing a phishing website.
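
The recognition step can be illustrated with a small sketch. It assumes a categorical naive Bayes model with Laplace smoothing, which is one possible reading of "Bayes-based"; the feature names (`entry`, `source`), the training rows, and the 0.5 threshold are all hypothetical and are not taken from the patent.

```python
import math
from collections import Counter, defaultdict


class BehaviorBayes:
    """Categorical naive Bayes over browsing-behavior features (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant
        self.class_counts = Counter()               # label -> number of samples
        self.feature_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values = defaultdict(set)              # feature -> values seen in training

    def fit(self, samples):
        """samples: iterable of (feature_dict, label) pairs."""
        for features, label in samples:
            self.class_counts[label] += 1
            for name, value in features.items():
                self.feature_counts[(label, name)][value] += 1
                self.values[name].add(value)

    def posterior(self, features):
        """Return {label: P(label | features)} under the naive Bayes assumption."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)         # prior estimated from the survey data
            for name, value in features.items():
                seen = self.feature_counts[(label, name)][value]
                denom = count + self.alpha * len(self.values[name])
                score += math.log((seen + self.alpha) / denom)
            log_scores[label] = score
        peak = max(log_scores.values())             # subtract the peak for numerical stability
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        norm = sum(weights.values())
        return {k: w / norm for k, w in weights.items()}


# Hypothetical survey results: how the page was reached and where the link came from.
TRAINING = [
    ({"entry": "typed_url", "source": "none"}, "normal"),
    ({"entry": "typed_url", "source": "none"}, "normal"),
    ({"entry": "clicked_link", "source": "shopping"}, "normal"),
    ({"entry": "clicked_link", "source": "email"}, "suspicious"),
    ({"entry": "clicked_link", "source": "qq"}, "suspicious"),
    ({"entry": "clicked_link", "source": "email"}, "suspicious"),
]


def should_activate_engine(model, features, threshold=0.5):
    """Activate the detection engine when the suspicious posterior reaches the threshold."""
    return model.posterior(features).get("suspicious", 0.0) >= threshold
```

With this toy data, a link clicked from an e-mail activates the engine, while a URL typed directly into the address bar does not.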

(2) Lightweight intelligent phishing-website detection engine

The relevant literature and the monthly statistics published by PhishTank and APAC show that the number of phishing attacks fluctuates but that the targets are concentrated, mainly on payment and transaction, financial securities, instant messaging, and broadcast media websites. According to the APAC bulletin of December 2011, phishing websites targeting Taobao, Tencent, Industrial and Commercial Bank of China, and Bank of China accounted for 94.39% of all reports. A knowledge base of well-known websites is therefore modeled and serves as the heuristic knowledge of the detection engine. The knowledge base contains information that describes a site's identity: domain name, IP address, URL, brand name, copyright information, logo descriptors, WHOIS records, and so on. The detailed technical route of the detection engine is as follows:

Online fast URL filtering. A whitelist provides fast filtering: for URLs that the user has added to the whitelist, the detection engine skips detection. The research group intends to use a blacklist synchronized with PhishTank, APWG, and the Google Safe Browsing API to filter phishing URLs quickly and block user access.
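
A minimal sketch of this list-based pre-filter follows. It assumes the whitelist is a set of host names and the blacklist is a set of already normalized URLs; synchronizing the blacklist with an external feed is not shown, and the helper names and `.test` hosts are hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Canonical form for list lookups: lowercase scheme and host, no fragment,
    no trailing slash. The port is dropped, which is a simplification."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    query = "?" + parts.query if parts.query else ""
    return f"{parts.scheme.lower()}://{host}{path}{query}"


def host_of(url):
    """Lowercased host name of a URL, or an empty string if there is none."""
    return (urlsplit(url.strip()).hostname or "").lower()


def prefilter(url, whitelist_hosts, blacklist_urls):
    """Return 'allow' (skip detection), 'block' (known phishing URL),
    or 'unknown' (hand over to the later detection layers)."""
    if host_of(url) in whitelist_hosts:
        return "allow"
    if normalize(url) in blacklist_urls:
        return "block"
    return "unknown"
```

Checking the whitelist first mirrors the text: a user-approved site is never passed to the detection engine, and only URLs that match neither list continue to the feature-based classifier.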

For URLs that the lists cannot decide, an online phishing-URL detection algorithm that fuses multi-layer features is used. The algorithm is intended to use four layers of features (structural, lexical, domain-name, and server) to build an SVM-based learning classification model for fast classification of phishing URLs.
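
The feature-extraction side of such a classifier might look like the sketch below. It covers only structural, lexical, and domain-name features that can be computed from the URL string; server-layer features need network lookups and are left out, and the SVM itself is not shown. The token list and feature names are hypothetical choices for illustration.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Hypothetical vocabulary of words that often appear in phishing URLs.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "account", "update")


def is_ip(host):
    """True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Map a URL to a dictionary of numeric features for a downstream classifier."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    tokens = [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]
    host_is_ip = is_ip(host)
    return {
        # structural layer
        "url_length": len(url),
        "path_depth": len([s for s in parts.path.split("/") if s]),
        "has_at_sign": int("@" in url),
        # lexical layer
        "suspicious_tokens": sum(t in SUSPICIOUS_TOKENS for t in tokens),
        # domain-name layer
        "host_is_ip": int(host_is_ip),
        "subdomain_count": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "host_hyphens": host.count("-"),
    }
```

The resulting dictionaries can be turned into fixed-order vectors and passed to any SVM implementation for training and prediction.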

Fast filtering by interactivity. Because the purpose of a phishing website is to obtain user input, checking whether a page contains a form that submits input to the server (form tags, input tags, login forms) is an effective filter. A page without such a form can be judged directly not to be a phishing page; a page with one must be examined further for content and visual similarity. Interactivity detection uses DOM-tree-based identification and filtering of forms submitted to the web server: the markup source is parsed to build the page's DOM tree, and the form, input, and login-submission controls in the tree are identified, which gives a fast phishing-page classification.
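
The interactivity check can be sketched with Python's standard `html.parser` module. Note that `HTMLParser` is an event-based parser rather than a full DOM tree, so this is a simplified stand-in for the DOM traversal described above; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class FormScanner(HTMLParser):
    """Counts form and input elements while the markup is parsed."""

    def __init__(self):
        super().__init__()
        self.forms = 0
        self.inputs = 0
        self.password_inputs = 0

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value) pairs.
        if tag == "form":
            self.forms += 1
        elif tag == "input":
            self.inputs += 1
            if (dict(attrs).get("type") or "").lower() == "password":
                self.password_inputs += 1


def needs_further_checks(html):
    """True if the page can collect user input and must go on to the
    content and visual similarity layers; False if it can be skipped."""
    scanner = FormScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.forms > 0 or scanner.inputs > 0
```

A page that only displays text and links returns False and is filtered out early, which keeps the more expensive noise and logo analyses for pages that actually ask for input.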

Detection learning algorithm based on webpage noise. Webpage noise refers to the content contained in the page template, such as navigation bars, organization marks, contact details, and advertisement banners. Noise content mostly carries the site's identity information, and phishing websites copy this information from the target site to deceive users more effectively. Using the webpage noise in place of the whole page content makes site identification possible while reducing the influence of the rest of the page on the performance and efficiency of the detection algorithm. The research group intends to analyze the webpage noise with Web information processing techniques such as n-grams, term-frequency vectors, TF-IDF, and shingles to determine its feature patterns, and to build an SVM machine-learning classification model. A suspected site is judged to be a phishing website if it uses the template of a well-known website but is inconsistent with the information in the knowledge base.
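
One of the named techniques, shingling, can be sketched as follows. The sketch compares the noise text of a suspected page with that of a known site using word shingles and Jaccard similarity, then applies a simple threshold rule in place of the SVM model mentioned in the text. The threshold value, the function names, and the use of the domain as the only knowledge-base field are assumptions made for illustration.

```python
def shingles(text, k=3):
    """Set of k-word shingles (overlapping word windows) of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def template_similarity(noise_a, noise_b, k=3):
    """Similarity in [0, 1] between the noise text of two pages."""
    return jaccard(shingles(noise_a, k), shingles(noise_b, k))


def looks_like_phishing(noise, domain, known_noise, known_domain, threshold=0.6):
    """Flag a page whose template noise matches a known site while its domain
    does not match the knowledge-base entry (hypothetical decision rule)."""
    same_template = template_similarity(noise, known_noise) >= threshold
    return same_template and domain.lower() != known_domain.lower()
```

The rule captures the two-part test in the text: a high similarity alone is not suspicious, since the genuine site also matches its own template; it is the combination with mismatching identity information that marks a page as phishing.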

Classification algorithm based on site logo recognition. The site logo is a salient point of a web page and a region of interest for users. Users often identify a website by its logo, and phishing websites exploit this to deceive them. The research group intends to use SIFT to analyze the logos of well-known websites and determine the descriptors that characterize each logo, so that the logos of the target sites most frequently counterfeited by phishing websites are modeled. A suspected site is judged to be a phishing website if it uses the logo of a well-known website but is inconsistent with the information in the knowledge base.

(3) Intelligent phishing processing mechanism

The browser BHO (Browser Helper Object) interface provides the relevant interface specifications and thus an interface for the intelligent handling of phishing websites. The BHO specifications, interfaces, methods, events, and common classes are analyzed first in order to determine how to capture the user's mouse clicks and keyboard input as well as the address-bar and status-bar events. Input information is obfuscated with the SHA-1 algorithm, and the intelligent phishing processing mechanism is implemented in code.
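
The obfuscation step could be sketched as below, using Python's standard `hashlib`. The patent names only SHA-1; prepending a random salt is an added assumption so that the value sent to the phishing site cannot be matched against a precomputed table, and the function name is hypothetical. The BHO event handling itself is browser-specific and is not shown.

```python
import hashlib
import secrets


def obfuscate_input(value, salt=None):
    """Replace user input with the SHA-1 hex digest of salt + input.

    The digest, not the original text, is what would be submitted to a page
    that was flagged as phishing after the user ignored the warning.
    """
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per submission (assumption)
    return hashlib.sha1(salt + value.encode("utf-8")).hexdigest()
```

The output is always a 40-character hexadecimal string, so the phishing site receives a well-formed but useless value in place of the real credential.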

Claims

1. An intelligent phishing defense system, characterized in that it is composed of a user behavior identification module, a lightweight intelligent phishing-website detection engine, and an intelligent phishing processing module, and provides intelligent and timely phishing defense services for network users.

2. The intelligent phishing defense system according to claim 1, characterized in that the user behavior identification module uses a Bayes-based user behavior recognition algorithm.

3. The intelligent phishing defense system according to claim 1, characterized in that the lightweight intelligent phishing-website detection engine module performs rapid detection in four layers: URL, interactivity, webpage noise, and site logo recognition; it comprises an online phishing-URL detection algorithm that fuses multi-layer features, DOM-tree-based identification and filtering of forms submitted to the web server, a phishing-website detection learning algorithm based on webpage noise, and a phishing-website detection algorithm based on site logo recognition.

4. The intelligent phishing defense system according to claim 1, characterized in that, in accordance with the browser BHO object specification, a detected phishing website is first handled by a mechanism that warns the user through the URL address bar, the status bar, or other warning signs; and, when the user ignores the warning mechanism, a module obfuscates the information entered by the user in order to protect it.