CN104881608A - XSS vulnerability detection method based on simulating browser behavior - Google Patents

XSS vulnerability detection method based on simulating browser behavior
Info

Publication number
CN104881608A
Authority
CN
China
Prior art keywords
url
page
xss
vulnerability
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510262308.8A
Other languages
Chinese (zh)
Other versions
CN104881608B (en)
Inventor
王丹
刘源
赵文兵
杜金莲
苏航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201510262308.8A
Publication of CN104881608A
Application granted
Publication of CN104881608B
Expired - Fee Related
Anticipated expiration

Abstract

Translated from Chinese

A dynamic XSS vulnerability detection method based on simulated browser behavior. The crawler module embeds a browser kernel, so it can simulate browser behavior to parse JavaScript and load Ajax content and thereby find hidden injection points in a page; compared with traditional approaches, this greatly increases the coverage of injection points. The vulnerability detection module uses black-box testing: after submitting an attack vector, it simulates browser behavior to detect whether the page behaves abnormally, that is, whether the browser executed the injected script, and so directly determines whether the current injection point is vulnerable, which is more accurate than traditional methods. In addition, the method is developed entirely in Python, which makes it easy to maintain and extend, and it has significant practical value for the detection and study of XSS vulnerabilities.

Description

An XSS vulnerability detection method based on simulated browser behavior

Technical field

The invention relates to an XSS vulnerability detection method based on simulated browser behavior, and belongs to the field of cross-site scripting vulnerabilities in computer software.

Background

In recent years, as Web applications have come into wide use, Web security problems have become increasingly prominent. In the top ten Web application security risks published by OWASP in 2013, cross-site scripting (XSS) ranked third, which shows that XSS vulnerabilities have become one of the common security risks that all kinds of websites must face.

An XSS vulnerability arises when untrusted data from a user is processed by the application without validation and is reflected back to the browser without encoding or escaping, so that the browser engine executes it as code. Many websites omit the necessary input validation during development and lack adequate security, which makes them easy targets for cross-site scripting attacks. Typically an attacker submits a malicious script to a Web page that has an XSS vulnerability. When a client user browses that page, the browser automatically parses and executes the script, which lets the attacker plant Trojans, conduct phishing, steal user cookies, hijack the user's Web actions, and so on. Detecting XSS vulnerabilities is therefore essential.

Generally, a place in a Web page where an XSS vulnerability may exist is called an injection point. Finding and testing the potential injection points across a large number of pages is one of the keys to preventing XSS vulnerabilities, and it is also tedious work. Now that website content keeps growing, checking injection points by hand is clearly unrealistic, and automated methods should be used wherever possible. A web crawler is an essential basic component of network-based automated testing tools. Starting from an initial URL, it analyzes the content of each page, uses suitable algorithms to find new URLs, and keeps fetching pages in a loop until a stop condition is met, thereby collecting a large number of pages in which to look for injection points. Once an injection point is found, the testing tool constructs an attack test request, sends it to the target site, and judges from the site's response whether a vulnerability exists.

Research on automated XSS vulnerability detection tools is still limited. Traditional methods all crawl pages with a static crawler: they obtain the directory structure of the target site, parse the source code of each page, and extract the form information in order to find injection points. However, an injection point may well be hidden in the dynamic content of a page and may only be generated after a user action, such as clicking a button, causes the browser to parse JavaScript or load Ajax. Because traditional crawlers cannot simulate browser behavior, they can hardly parse JavaScript or load Ajax, and so they miss hidden injection points. In addition, when parsing a page they must extract the entire form and read its attributes to work out how data is submitted to the server before they can submit an attack vector, which is complicated. In vulnerability detection they also cannot dynamically analyze the response of the target site, so they cannot always determine whether an XSS vulnerability exists.

Summary of the invention

The invention uses dynamic analysis, detecting XSS vulnerabilities by examining the runtime behavior of the Web application, and designs and implements a crawler framework based on the Ghost.py library. Because the framework uses black-box testing, it determines whether an XSS vulnerability exists with higher accuracy.

To achieve the above purpose, the technical solution adopted by the invention is an XSS vulnerability detection method based on simulated browser behavior. The method is written entirely in Python on a 64-bit Windows system and runs normally on 64-bit Windows. The method is also highly general and supports other operating systems.

The system that implements the method comprises two main modules, a crawler module and a vulnerability detection module. Each contains several sub-modules that implement the core functions, as follows:

(1) Crawler module: this module comprises two sub-modules, a page exploration module and a web page parsing module. Both use Ghost.py as the browser engine and share and operate on the same URL list. The page exploration module explores pages, and the web page parsing module parses them. The page exploration module uses a recursive depth-first crawler that repeatedly fetches pages and stores them in the URL queue until all pages under the same domain name have been visited, thereby collecting a large number of pages in which to look for injection points. The web page parsing module takes page URLs from the URL queue, loads each page completely, including its dynamic content, and triggers the events in the page to obtain the new URLs and injection points generated by JavaScript or Ajax. The new URLs are also stored in the URL queue, where they wait to be visited by the page exploration module.

The web page parsing module works in the following steps:

1) Event collection: find the click events in the page that may parse JavaScript or load Ajax, and trigger them;

2) URL collection: put new URLs into the list of URLs to be visited, for use in page exploration;

3) Injection point collection: record injection points for the subsequent vulnerability detection.

(2) Vulnerability detection module: this module comprises two sub-modules, an automatic detection module and a vulnerability judgment module, both of which use Ghost.py as the browser engine. The automatic detection module automatically fills the injection points with attack vectors. The attack vectors are taken from the Cheat Sheet provided by RSnake, which contains many vectors designed to bypass XSS checks. After these crafted attack vectors are submitted, the result is passed to the vulnerability judgment module. If a vulnerability exists, the page executes a script that pops up an alert box whose content is XSS. The wait_for_alert() function provided by the Ghost.py engine is then used to detect whether an alert box appears, which reveals whether the page executed the script and directly determines whether the current injection point is vulnerable.

Before a page is explored, it must also be parsed: the page is loaded completely, including its dynamic content, and the events in the page are triggered to obtain the new URLs and injection points generated by JavaScript or Ajax. Page loading is performed through the API provided by Ghost.py.

The system uses Python's Beautiful Soup library to parse web pages. Beautiful Soup is an HTML/XML parser written in Python that handles malformed markup, builds a parse tree, and provides simple, commonly used operations for navigating, searching, and modifying the parse tree.

In summary, to better support automated detection, the system implements two functions: ① a framework supporting a web crawler that can parse JavaScript and load Ajax to find the hidden injection points in a page; ② an efficient method for determining whether an XSS vulnerability exists by submitting attack vectors.

The core libraries include re, pywebfuzz, ghost, bs4, pySide, and pyQt. These libraries run on all mainstream operating systems, so the system ports well across platforms.

The system is developed entirely in Python, which makes it easy to maintain and extend, and it has significant practical value for the detection and study of XSS vulnerabilities.

Description of drawings

Figure 1. Overall system architecture (by module).

Figure 2. URL processing model design.

Figure 3. Vulnerability detection process design.

Detailed description

The method is based on black-box testing of the server using Ghost.py, and consists of two parts: a crawler module and a vulnerability detection module. The system architecture is shown in Figure 1.

1.1 Crawler module

The crawler module implements page exploration and web page parsing. The page-exploring crawler uses the recursive depth-first algorithm proposed here and explores only pages under the same domain name. The algorithm is described in Algorithm 1.

Algorithm 1. Depth-first recursive algorithm for page exploration

Input: starting website URL

Output: URLs of all same-domain pages crawled starting from the input URL

1. Set the maximum depth MAX_DEPTH;

2. Set the current depth depth = 0;

3. If the current depth is greater than the maximum depth, end; otherwise, go to step 4;

4. Visit the current URL;

5. Collect all URLs on the page and store them in URL_List;

6. If URL_List is empty, end; otherwise, go to step 7;

7. Take the next URL in URL_List as the current URL, increase the current depth by 1, and go to step 3.
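The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the value of MAX_DEPTH, the `visited` list, and the `fetch_links` callable (a stand-in for the Ghost.py-based page parser, taking a URL and returning the href values found on that page) are all assumptions made for the example.

```python
from urllib.parse import urljoin, urlparse

MAX_DEPTH = 3  # assumed value; the source does not specify one


def same_domain(url, root):
    """Return True when url has the same host as the starting URL."""
    return urlparse(url).netloc == urlparse(root).netloc


def explore(url, fetch_links, root=None, depth=0, visited=None):
    """Recursive depth-first exploration limited to MAX_DEPTH and one domain.

    fetch_links is a hypothetical callable that returns the href values
    found on the page at the given URL.
    """
    root = root or url
    visited = visited if visited is not None else []
    # Step 3 plus two practical guards: skip visited and off-domain URLs.
    if depth > MAX_DEPTH or url in visited or not same_domain(url, root):
        return visited
    visited.append(url)                      # step 4: visit the current URL
    for href in fetch_links(url):            # step 5: URLs found on the page
        # Step 7: descend into the next URL with depth + 1.
        explore(urljoin(url, href), fetch_links, root, depth + 1, visited)
    return visited
```

Passing the link extractor in as a function keeps the traversal independent of the browser engine, so the same loop can be exercised against an in-memory dictionary of pages before it is pointed at a real site.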

Before a page is explored, it must also be parsed: the page is loaded completely, including its dynamic content, and the events in the page are triggered to obtain the new URLs and injection points generated by JavaScript or Ajax. Page loading is performed through the API provided by Ghost.py.

Web page parsing performs three main functions. The first is event collection: finding the click events in the page that may parse JavaScript or load Ajax, and triggering them. The second is URL collection: putting new URLs into the list of URLs to be visited, for use in page exploration. The third is injection point collection, for the subsequent vulnerability detection.

The system uses Python's Beautiful Soup library to parse web pages. Beautiful Soup is an HTML/XML parser written in Python that handles malformed markup well, builds a parse tree, and provides simple, commonly used operations for navigating, searching, and modifying the parse tree.

(1) Triggering events

To trigger events, the Beautiful Soup library is used to search for tags that carry event attributes, and Ghost.py then simulates a user click to trigger each event. A click may cause the browser to parse JavaScript and load Ajax, producing either a change in the DOM elements or a jump to another URL, and the two cases are handled differently. If the click jumps to a new URL, that URL is stored and the crawler returns to the previous page. If new DOM elements are generated, the page is searched again for new events, until no more DOM elements are generated. The steps are described in Algorithm 2:

Algorithm 2. Page DOM element expansion algorithm

Input: HTML code of the page obtained by the first request

Output: HTML code of the expanded page

1. Collect all tags that contain events, store them in tag_list, and remove duplicate tags;

2. Simulate a click on the next unvisited tag in tag_list;

3. Store the tag in visit[] and mark it as visited;

4. If the page jumps to another URL, go to step 5; otherwise, go to step 6;

5. Store the URL of the page reached in URL_List and go to step 2;

6. If the DOM elements have changed, go to step 1.
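The expansion loop of Algorithm 2 might look like the sketch below. Several things here are assumptions for illustration only: the standard-library html.parser is used in place of Beautiful Soup so the example has no dependencies, only onclick attributes are collected, a tag is identified by its (tag name, id) pair, and `page` is a hypothetical browser wrapper exposing `.html`, `.url`, `.click(key)` and `.back()`, which is not the Ghost.py API.

```python
from html.parser import HTMLParser

EVENT_ATTRS = ("onclick",)  # assumption: only click handlers are collected


class EventTagFinder(HTMLParser):
    """Collect a (tag, id) key for every tag that carries an event attribute."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if any(name in attrs for name in EVENT_ATTRS):
            key = (tag, attrs.get("id"))
            if key not in self.tags:          # step 1: drop duplicates
                self.tags.append(key)


def event_tags(html):
    finder = EventTagFinder()
    finder.feed(html)
    return finder.tags


def expand(page, url_list):
    """Click every event-bearing tag until no unvisited ones remain."""
    visited = []
    pending = event_tags(page.html)
    while pending:
        key = pending.pop(0)
        visited.append(key)                   # step 3: mark as visited
        before_url = page.url
        page.click(key)                       # step 2: simulated click
        if page.url != before_url:            # steps 4-5: the page jumped
            if page.url not in url_list:
                url_list.append(page.url)
            page.back()
        # Step 6: the DOM may have changed, so rescan for unvisited tags.
        pending = [k for k in event_tags(page.html) if k not in visited]
    return page.html
```

Rescanning after every click is the simplest way to honor step 6; a real implementation could compare the DOM before and after the click and rescan only when it differs.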

In this way the page can be expanded step by step in order to find hidden injection points.

(2) Adding URLs

URL hyperlinks generally appear in the href attribute of <a> tags. For an <a> tag in HTML, the value of the href attribute can be the relative or absolute URL of any valid document, and may include a fragment identifier or a JavaScript code snippet. When a user clicks the content of an <a> tag, the browser may jump to the URL specified by the href attribute, or it may execute a list of JavaScript expressions, methods, and functions.

Traditional web crawlers only use regular expressions to match ordinary URL forms, which is likely to miss pages and injection points. This system therefore uses the Ghost.py library, which includes a browser engine, to process the href value in several ways, as shown in Figure 2. A normalization function performs string processing for each case and converts the value into the form of an ordinary URL. If the converted URL is not yet in the list, it is stored in the URL list for later page exploration.
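A normalization function of the kind described could be sketched as below. The source does not list the cases handled in Figure 2, so the rules here are assumptions: relative links are resolved against the page URL, fragment identifiers are dropped, and `javascript:` or `mailto:` values are skipped on the assumption that script links are exercised by the event-triggering step instead. The names `normalize_href` and `add_url` are hypothetical.

```python
from urllib.parse import urljoin, urldefrag


def normalize_href(base_url, href):
    """Turn an href value into an absolute URL, or None if it is not a page link."""
    href = (href or "").strip()
    if not href or href.startswith("#"):
        return None                       # pure fragment: stays on the same page
    if href.lower().startswith(("javascript:", "mailto:")):
        return None                       # script or mail link: nothing to crawl
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    return absolute


def add_url(url_list, base_url, href):
    """Append the normalized URL to url_list unless it is already present."""
    url = normalize_href(base_url, href)
    if url is not None and url not in url_list:
        url_list.append(url)
    return url_list
```

Normalizing before the membership test matters: without it, `../x.html#top` and `/x.html` would be queued as two different pages even though they resolve to the same document.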

1.2 Vulnerability detection module

(1) Vulnerability detection

The system uses black-box testing to detect whether a target form has an XSS vulnerability. The basic approach is to fill in the form with attack vectors from the Cheat Sheet provided by RSnake and submit it. The Cheat Sheet contains many attack vectors that can bypass XSS checks, as listed in Table 1.

After these crafted attack vectors are submitted, if a vulnerability exists the page executes a script that pops up an alert box whose content is XSS. The wait_for_alert() function provided by Ghost.py is then used to detect whether an alert box appears, which reveals whether the page executed the script and directly determines whether the current injection point is vulnerable. With this method, if a dialog box pops up and it contains the tainted data, the current form definitely has an XSS vulnerability. The vulnerability detection process is shown in Figure 3.

(2) Finding forms and their injection points

To submit a particular form, its position in the DOM tree must be recorded so that it can later be located with a CSS attribute selector. First, all forms in the HTML document are found and stored in an array, labeled form[0], form[1], and so on. Then the inputs of each form are found, for example input[0] in form[0] and input[1] and input[2] in form[1], and their name attributes are stored in a two-dimensional array. Because the name attribute is the only attribute needed when submitting the request, the other attributes need not be saved.
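The two-dimensional array of forms and input names could be built roughly as follows. This sketch uses the standard-library html.parser rather than Beautiful Soup so that it runs without extra dependencies; the class and function names are hypothetical, and only <input> elements with a name attribute are recorded, as in the description above.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Build forms[i] = list of name attributes of the inputs in the i-th form."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._in_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append([])             # a new row for form[i]
            self._in_form = True
        elif tag == "input" and self._in_form and attrs.get("name"):
            self.forms[-1].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False


def collect_injection_points(html):
    """Return a list of lists: one list of input names per form, in document order."""
    collector = FormCollector()
    collector.feed(html)
    return collector.forms
```

The row index doubles as the form's position in `document.querySelectorAll('form')`, which is what a later submit step needs in order to address the same form again.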

(3) Automatically filling in and submitting the form

The form-filling function provided by Ghost.py is used to enter an XSS attack vector into a form field:

ghost.set_field_value("input[name=%s]"%name,xss)

In addition, Ghost.py can execute a JavaScript statement to submit the form:

ghost.evaluate("document.querySelectorAll('form')[%d]['submit']();"%form_i,expect_loading=True)

A form may have front-end validation, such as a limit on input length or a ban on certain illegal characters, which prevents the attack vector from being submitted. These validation events are held in attributes of the form's elements, and JavaScript statements are executed to remove those attributes:

document.querySelectorAll('input[type=submit]')[0].removeAttribute('onclick');

document.querySelectorAll('input[type=submit]')[0].removeAttribute('onfocus');
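When several elements or attributes need the same treatment, the JavaScript can be generated instead of written by hand. The helper below is a hypothetical sketch, not part of Ghost.py: it only builds a string, and the default attribute list simply mirrors the two statements above.

```python
import json


def strip_validation_js(selector, attributes=("onclick", "onfocus")):
    """Return a JavaScript statement that removes the given attributes
    from every element matching the CSS selector."""
    # json.dumps produces a safely quoted JavaScript string literal.
    removals = "".join(
        "el.removeAttribute(%s);" % json.dumps(name) for name in attributes
    )
    return (
        "(function(){var els=document.querySelectorAll(%s);"
        "for(var i=0;i<els.length;i++){var el=els[i];%s}})();"
        % (json.dumps(selector), removals)
    )
```

The resulting string could then be handed to the browser engine, for example through the same ghost.evaluate call used above to submit the form.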

The subsequent steps for operating on the form are described in Algorithm 3:

Algorithm 3. Automatic filling and submission of attack vectors

Input: two-dimensional array storing the forms and their injection points

Output: vulnerability detection result

1. Iterate over the xss_rsnake array, which holds all XSS attack vectors;

2. Fill every user input field in the form with the current attack vector;

3. Submit the form;

4. Use the vulnerability detection method to determine whether an XSS vulnerability exists; if so, go to step 5; otherwise, return to step 1 and take the next attack vector;

5. Store the position of the vulnerability in the DOM, the current page URL, and other information;

6. End.
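Algorithm 3 can be sketched as a loop over attack vectors for one form. The `browser` object is a hypothetical adapter, introduced here only for illustration: `fill` would wrap ghost.set_field_value, `submit` would wrap the ghost.evaluate call shown earlier, and `alert_seen` would wrap wait_for_alert(). The function name `probe_form` and the shape of the returned record are likewise assumptions.

```python
def probe_form(browser, form_index, input_names, vectors):
    """Try each attack vector on one form.

    Returns a record describing the first vector that triggers an alert,
    or None when no vector does.
    """
    for xss in vectors:                               # step 1: next vector
        for name in input_names:                      # step 2: fill every input
            browser.fill('input[name="%s"]' % name, xss)
        browser.submit(form_index)                    # step 3: submit the form
        if browser.alert_seen():                      # step 4: alert box shown?
            return {                                  # step 5: record the finding
                "url": browser.url,
                "form": form_index,
                "inputs": list(input_names),
                "vector": xss,
            }
    return None                                       # step 6: end, nothing found
```

Stopping at the first successful vector follows the step order of Algorithm 3; a tool that wants to report every working vector would collect the records in a list instead of returning early.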

Table 1. Some designed attack vectors

Claims (1)

Translated fromChinese
1.一种基于模拟浏览器行为的XSS漏洞检测方法,其特征在于:本方法基于Ghost.py的对服务器的黑盒测试,它由爬虫模块和漏洞检测模块两个部分组成;1. A method for detecting XSS vulnerabilities based on simulated browser behavior, characterized in that: the method is based on the black box test of Ghost.py to server, and it is composed of crawler module and vulnerability detection module;1.1爬虫模块1.1 Crawler module爬虫模块实现探索页面功能和网页解析功能;探索页面的爬虫使用本文提出的递归的深度优先算法,仅挖掘同域名下的页面;该算法描述如算法1所示;The crawler module implements the functions of exploring pages and analyzing webpages; the crawler for exploring pages uses the recursive depth-first algorithm proposed in this paper to mine only the pages under the same domain name; the description of the algorithm is shown in Algorithm 1;算法1.页面探索的深度优先递归算法Algorithm 1. Depth-first recursive algorithm for page exploration输入:起始网站URLInput: Start Website URL输出:以输入URL为起点爬取到的所有同域名页面URLOutput: URLs of all pages with the same domain name crawled starting from the input URL1.设置最大深度MAX_DEPTH;1. Set the maximum depth MAX_DEPTH;2.设置当前深度depth=0;2. Set the current depth depth=0;3.如果当前深度大于最大深度,结束;否则,执行步骤4;3. If the current depth is greater than the maximum depth, end; otherwise, go to step 4;4.访问当前URL;4. Access the current URL;5.获取页面所有URL存入URL_List;5. Obtain all URLs of the page and store them in URL_List;6.如果URL_List为空,结束;否则执行步骤7;6. 
If URL_List is empty, end; otherwise, go to step 7;将URL_List中下一个URL作为当前URL,当前深度加1,执行步骤3;Take the next URL in URL_List as the current URL, add 1 to the current depth, and execute step 3;在探索页面前,还需要进行网页解析,将页面动态加载完,并触发页面中的事件以获取JavaSricpt或Ajax生成的新的URL和注入点;其中的加载页面由Ghost.py提供的API完成,Before exploring the page, it is necessary to analyze the webpage, dynamically load the page, and trigger events in the page to obtain the new URL and injection point generated by JavaSricpt or Ajax; the loading page is completed by the API provided by Ghost.py,网页解析主要完成三个功能,一是事件搜集,寻找网页中可能解析JavaScript和加载Ajax的点击事件并触发;二是URL搜集,将新的URL放入待访问的URL列表用于探索页面;三是注入点搜集,用于之后的漏洞检测;Web page parsing mainly completes three functions, one is event collection, finds and triggers click events that may parse JavaScript and load Ajax in web pages; the other is URL collection, puts new URLs into the list of URLs to be visited for exploring pages; It is the collection of injection points for later vulnerability detection;本方法使用Python的Beautiful Soup库来完成网页解析;Beautiful Soup是一个用Python写的HTML/XML的解析器,它可以很好的处理不规范标记并生成剖析树,并且提供简单又常用的导航,搜索以及修改剖析树的操作;This method uses Python's Beautiful Soup library to complete webpage parsing; Beautiful Soup is an HTML/XML parser written in Python, which can handle irregular tags well and generate a parse tree, and provides simple and commonly used navigation. 
Search and modify parse tree operations;(1)触发事件(1) Trigger event触发事件时,使用Beautiful Soup库搜索带有事件属性的标签,之后用Ghost.py模拟用户点击触发事件;对事件进行点击后可能使浏览器解析JavaScript和加载Ajax,产生DOM元素的改变或者URL的跳转,对此采取不同的方式应对;如果跳转到新的URL,存储当前URL并返回之前页面即可,而产生DOM元素则需再次寻找是否出现了新的事件,直至不再产生DOM元素为止,步骤如算法2描述:When an event is triggered, use the Beautiful Soup library to search for tags with event attributes, and then use Ghost.py to simulate user clicks to trigger events; clicking on the event may cause the browser to parse JavaScript and load Ajax, resulting in changes in DOM elements or URLs Jump, take different ways to deal with this; if you jump to a new URL, just store the current URL and return to the previous page, but if you want to generate a DOM element, you need to find out whether there is a new event again until no more DOM elements are generated So far, the steps are described in Algorithm 2:算法2.页面DOM元素展开算法Algorithm 2. Page DOM element expansion algorithm输入:第一次请求得到的页面HTML代码Input: HTML code of the page obtained by the first request输出:展开后的页面HTML代码Output: HTML code of the expanded page1.获取所有含有事件的标签存入tag_list,去除重复的标签;1. Obtain all tags containing events and store them in tag_list, and remove duplicate tags;2.模拟点击tag_list中下一个未访问过的标签;2. Simulate clicking on the next unvisited tag in tag_list;3.将该标签存入visit[],标记为访问过;3. Store the label in visit[] and mark it as visited;4.若页面跳转,执行步骤5;否则,执行步骤6;4. If the page jumps, go to step 5; otherwise, go to step 6;5.将跳转后的页面URL存入URL_List,执行步骤2;5. Save the redirected page URL into URL_List and execute step 2;如果DOM元素改变,执行步骤1;If the DOM element changes, execute step 1;通过这种方式,可以将网页不断展开,以达到寻找隐藏式注入点的目的;In this way, the webpage can be continuously expanded to achieve the purpose of finding hidden injection points;(2)添加URL(2) Add URLURL超链接一般存在于<a>标签的href属性中,对于HTML中的<a>标签,其href属性的值可以是任何有效文档的相对或绝对URL,包括片段标识符和JavaScript代码段;一般用户点击<a>标签中的内容时,浏览器除了会跳转到href属性指定的URL,也可能会执行JavaScript表达式、方法和函数的列表;URL hyperlinks generally exist in the href attribute of the <a> tag. 
For the <a> tag in HTML, the value of the href attribute can be a relative or absolute URL of any valid document, and may include a fragment identifier or a JavaScript code snippet. When the user clicks the content of an <a> tag, the browser usually navigates to the URL given by href, but it may also execute JavaScript expressions, methods, and functions. Traditional web crawlers match only ordinary URL patterns with regular expressions, so they are likely to miss pages and injection points. This system therefore uses the Ghost.py library, which includes a browser engine, to process href values in several ways. A normalization function applies string processing to each case and converts the value into an ordinary URL. If the converted URL is not already in the URL list, it is stored there for later page mining.

1.2 Vulnerability detection module

(1) Vulnerability detection

This method uses black-box testing to detect whether a target form has an XSS vulnerability. The basic approach is to fill in and submit the form using the Cheat Sheet provided by RSnake as the source of attack vectors. The Cheat Sheet contains a variety of attack vectors that can bypass XSS checks.

After one of these crafted attack vectors is submitted, a vulnerable page executes a script that pops up an alert box whose content is XSS. The wait_for_alert() function provided by Ghost.py is then used to detect whether an alert box appears, that is, whether the page executed the script, which directly determines whether the current injection point is vulnerable. With this method, if a dialog box pops up and it contains the tainted data, the current form must have an XSS vulnerability.

(2) Finding forms and their injection points

To submit a form, its position in the DOM tree must be recorded so that it can later be located with a CSS attribute selector. First, all forms in the HTML document are found and stored in an array, labeled form[0], form[1], and so on. Then the inputs of each form are found, for example input[0] in form[0], and input[1] and input[2] in form[1], and their name attributes are stored in a two-dimensional array. Because the name attribute is the only attribute needed when submitting the request, the other attributes do not need to be saved.

(3) Automatically filling in and submitting the form

The form-filling function provided by Ghost.py is used to write the XSS attack vector into a form field:

    ghost.set_field_value("input[name=%s]"%name,xss)

Ghost.py can also evaluate JavaScript statements to submit the form:

    ghost.evaluate("document.querySelectorAll('form')[%d]['submit']();"%form_i,expect_loading=True)

A form may have front-end validation, such as an input length limit or a ban on certain illegal characters, that prevents the attack vector from being submitted. These validation events are attached as attributes of the form's elements, so JavaScript statements are evaluated to remove those attributes:

    document.querySelectorAll('input[type=submit]')[0].removeAttribute('onclick');
    document.querySelectorAll('input[type=submit]')[0].removeAttribute('onfocus');

The subsequent form operations are described in Algorithm 3:

Algorithm 3. Automatically filling in and submitting attack vectors
Input: a two-dimensional array storing the forms and their injection points
Output: vulnerability detection results
1. Traverse the xss_rsnake array, which holds all XSS attack vectors;
2. Fill every user input field in the form with the current attack vector;
3. Submit the form;
4. Use the vulnerability detection method to determine whether an XSS vulnerability exists; if so, go to step 5; otherwise return to step 1;
5. Store the position of the vulnerability in the DOM, the URL of the current page, and other information;
6. End.
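
The steps of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names Browser, Finding, scan_forms and FakeBrowser are hypothetical, and the Browser interface only stands in for the role the text assigns to Ghost.py (filling a field, submitting a form, and checking for an alert box). The FakeBrowser class is an in-memory test double, so the sketch runs without a real browser engine.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Sequence


class Browser(Protocol):
    """Hypothetical minimal interface; the text uses Ghost.py for this role."""

    def fill(self, selector: str, value: str) -> None: ...
    def submit(self, form_index: int) -> None: ...
    def alert_text(self) -> Optional[str]: ...


@dataclass
class Finding:
    form_index: int  # position of the vulnerable form in the DOM
    url: str         # URL of the page that contains the form
    vector: str      # attack vector that triggered the alert


def scan_forms(browser: Browser, url: str, forms: Sequence[Sequence[str]],
               vectors: Sequence[str], marker: str = "XSS") -> List[Finding]:
    """Try each vector on each form; record forms whose page raises an alert."""
    findings: List[Finding] = []
    for form_index, input_names in enumerate(forms):
        for vector in vectors:                                # step 1
            for name in input_names:                          # step 2
                browser.fill("input[name=%s]" % name, vector)
            browser.submit(form_index)                        # step 3
            text = browser.alert_text()                       # step 4
            if text is not None and marker in text:
                findings.append(Finding(form_index, url, vector))  # step 5
                break                                         # step 6
    return findings


class FakeBrowser:
    """Test double (assumption): only form 1 echoes its input unescaped."""

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}
        self._alert: Optional[str] = None

    def fill(self, selector: str, value: str) -> None:
        self.values[selector] = value

    def submit(self, form_index: int) -> None:
        payload = "".join(self.values.values())
        fired = form_index == 1 and "<script>alert('XSS')</script>" in payload
        self._alert = "XSS" if fired else None
        self.values.clear()

    def alert_text(self) -> Optional[str]:
        return self._alert


forms = [["q"], ["user", "comment"]]  # name attributes, one row per form
vectors = ["<b>x</b>", "<script>alert('XSS')</script>"]
findings = scan_forms(FakeBrowser(), "http://example.test/page", forms, vectors)
print(findings)
```

With a real browser engine, the page would normally need to be reloaded after each submission, and front-end validation attributes would be removed before filling the form, as described in step (3) above; the sketch leaves both out for brevity.
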
CN201510262308.8A, priority date 2015-05-21, filing date 2015-05-21: XSS vulnerability detection method based on simulating browser behavior. Status: Expired - Fee Related. Publication: CN104881608B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201510262308.8A / CN104881608B (en) | 2015-05-21 | 2015-05-21 | XSS vulnerability detection method based on simulating browser behavior

Publications (2)

Publication Number | Publication Date
CN104881608A (en) | 2015-09-02
CN104881608B (en) | 2018-03-16

Family

ID=53949098

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201510262308.8A | Expired - Fee Related | CN104881608B (en) | 2015-05-21 | 2015-05-21 | XSS vulnerability detection method based on simulating browser behavior

Country Status (1)

Country | Link
CN (1) | CN104881608B (en)

Also Published As

Publication number | Publication date
CN104881608B (en) | 2018-03-16

Legal Events

Code | Title | Description
C06 | Publication |
PB01 | Publication |
EXSB | Decision made by SIPO to initiate substantive examination |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2018-03-16; termination date: 2021-05-21
