CN117235716A


Info

Publication number: CN117235716A
Application number: CN202311508777.4A
Authority: CN
Inventors: 张汝云; 白冰; 孙天宁; 张奕鹏; 徐昊天; 孙才俊
Original assignee: Zhejiang Lab
Current assignee: Zhejiang Lab
Priority date: 2023-11-14
Filing date: 2023-11-14
Publication date: 2023-12-15
Anticipated expiration: 2043-11-14
Also published as: CN117235716B

Abstract

The present invention discloses a method and device for defending against unknown threats from OOXML document template injection attacks. The document is unpacked, the XML trees of the key files that may hold a malicious payload are traversed recursively, and the key tags that may carry malicious content are parsed. Based on a preset credibility dictionary and an evaluation algorithm, threatening tags are parsed, evaluated, and then deleted or rewritten. This breaks the malicious document constructed by the attacker and strips it of its ability to attack, so that users can largely avoid the risk of document template injection by phishing attackers. The invention moves past the limitation that traditional antivirus software can only detect threats and proposes a new defense scheme. It has essentially no impact on normal use of the document, its defense against malicious documents is general, and in theory it can prevent all known template injection attacks as well as variants of the same type, including attacks that exploit n-day and 0-day vulnerabilities.

Description

Translated from Chinese

An unknown threat defense method and device for OOXML document template injection attacks

Technical field

The present invention relates to the field of computer security, and in particular to a method and device for defending against unknown threats from OOXML document template injection attacks.

Background

In the prior art, static detection cannot deal effectively with unknown threats such as 0-day vulnerabilities, and signature database updates lag considerably. Dynamic detection can be bypassed by dynamic evasion techniques. Moreover, by the time a document is run dynamically the malicious payload already has what it needs to execute, so the outcome is hard to predict. Dynamic detection also consumes system resources, reduces efficiency, and degrades the user experience.

Therefore, to address the problem that the security of Office documents cannot be effectively guaranteed, the present invention departs from the traditional approach. Starting from the content format of the data file, it cleans the file according to preset credibility rules so that threats are minimized before the file is ever run. The presets are flexible: credibility rules can be customized as needed to favor either usability or security. Under the strictest trust settings, the method can in theory prevent all 0-day attacks of the template injection type and other unknown threats, and known threats are cleaned as well. The technique cleans an unknown document without opening it, so the malicious payload cannot gain execution while being cleaned, which avoids the drawbacks of dynamic detection.

Summary of the invention

The object of the present invention is to address the shortcomings of the prior art by providing a method and device for defending against unknown threats from OOXML document template injection attacks, so that the risk of template injection attacks in a document can be reduced without the user opening the file.

To achieve the above object, the present invention provides a method for defending against unknown threats from OOXML document template injection attacks, comprising the following steps:

(1) Obtain the OOXML document to be processed, and initialize the threat map and the credibility dictionary;

(2) Determine whether the OOXML document to be processed is encrypted; if so, first have the password removed by the user or by a computer program, then unpack (decompress) the document with the compression algorithm to obtain its component files; otherwise, unpack the document directly with the compression algorithm to obtain its component files;

(3) Identify the structure of the component files to determine which threat map applies; traverse the component files according to the preset threat map, and store the matched files that are not fully trusted in a list variable;

(4) Check the usability of the files in the list variable against the XML format standard, and strip unusable data such as unclosed tags and tags with abnormal symbols;

(5) Based on the preset credibility dictionary, parse the tag name of each XML tag in the file and assign it an initial credibility score; then match the tag value corresponding to each tag name against the matching rules, derive a threat value for each XML tag from the number of matches and the match length, and finally obtain the threat score;

(6) Determine whether the threat score obtained in step (5) is below the threshold; if so, remove the threat from the XML tag in the file, that is, delete the whole XML tag or modify its tag value; otherwise go to step (8);

(7) Perform a secondary check: repeat steps (5) and (6) until the threat score in step (6) is greater than or equal to the threshold;

(8) Repackage and compress all processed component files with the compression algorithm to restore the OOXML document; the resulting OOXML document is the file with the malicious payload removed.

Further, in step (1), the threat map takes the form of a blacklist and stores, as a list, the specified files that may carry an attack payload. The threat map embeds the credibility dictionaries used for the different formats and is initialized according to the format of the OOXML document to be processed. The credibility dictionary defines the set of tags that may pose a threat; each tag in the set defines an initial credibility score, matching rules for its tag values, and scoring criteria based on match length or number of matches.

Further, in step (2), the compression algorithm is based on the ECMA-376 Office Open XML format standard.

Further, in step (3), identifying the component files includes structure identification and name identification.

Further, in step (5), the matching rules include regular expression matching and string matching.

Further, in step (5), the threat score is obtained by adding the initial credibility score of the XML tag to its threat value.

Further, the threshold is greater than or equal to 60.

To achieve the above object, the present invention also provides an unknown threat defense device for OOXML document template injection attacks, comprising one or more processors configured to implement the above defense method.

To achieve the above object, the present invention also provides an electronic device comprising a memory and a processor coupled to the memory, wherein the memory stores program data and the processor executes the program data to implement the above defense method.

To achieve the above object, the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above defense method.

Compared with the prior art, the beneficial effects of the present invention are:

1. Unlike traditional detection techniques that analyze files dynamically or statically, the present invention removes attack payloads: it uniformly destroys any malicious payload in the input file. By acting proactively it blocks the attack at its source and avoids the weakness of traditional antivirus software, which can only detect threats and may miss some.

2. Compared with traditional dynamic detection, the present invention does not need to open the document, so unpacking and processing the document does not trigger the malicious attack.

3. The present invention moves past the limitation that traditional antivirus software can only detect threats and proposes a new defense scheme. It has essentially no impact on normal use of the document, its defense against malicious documents is general, and in theory it can prevent all known template injection attacks as well as variants of the same type, including attacks that exploit n-day and 0-day vulnerabilities.

Description of the drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Figure 1 is a schematic flow chart of the method of the present invention;

Figure 2 is a schematic diagram of the tag threat scoring process in the present invention;

Figure 3 is a schematic structural diagram of the threat map configuration file in the present invention;

Figure 4 is a schematic diagram of the attack payload of a template injection attack document;

Figure 5 is a schematic structural diagram of the device of the present invention;

Figure 6 is a schematic diagram of an electronic device.

Detailed description

Exemplary embodiments are described in detail here, with examples shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.

The terminology used in the present invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms "a", "said", and "the" used in the present invention and the appended claims are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, and so on may be used in the present invention to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present invention, first information may also be called second information, and similarly second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".

The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth here. Rather, these embodiments are provided so that the disclosure is thorough and complete and fully conveys the scope of the invention to those skilled in the art. In addition, the technical features of the embodiments described below can be combined with one another as long as they do not conflict.

The present invention proposes a method and device for defending against unknown threats from OOXML document template injection attacks, so that the risk of template injection attacks in a document can be reduced without the user opening the file.

Referring to Figure 1, the present invention provides a method for defending against unknown threats from OOXML document template injection attacks. The method includes the following steps:

Step 1: Obtain the OOXML document to be processed, and initialize the threat map and the credibility dictionary.

Specifically, a custom configuration file is read; its structure is shown in Figure 3. Each file type has its own credibility dictionary, and each dictionary presets matching rules and the corresponding tag threat score calculation rules for the tags that may occur. The threat map and the credibility dictionary are then initialized. The threat map (thread_map) takes the form of a blacklist and stores, as a list, the specified files that may carry an attack payload, to be traversed and cleaned. The threat map embeds the specific dictionaries (the credibility dictionaries) used for the different formats, so it can be initialized with different contents depending on the format of the OOXML document to be processed. The credibility dictionary (thread_dict) defines the set of tags that may pose a threat. Each tag in the dictionary defines its initial credibility score, the matching rules for its tag values (regular expressions, string matching, and so on), and scoring criteria based on match length or number of matches. Special threat handling rules are also set: for specific tags, namely tags whose maliciousness cannot be assessed by accumulating scores, a dedicated treatment is prescribed, such as removing the tag outright without computing its score, or other special operations outside the main processing logic.
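As a rough illustration of the configuration described in step 1, the sketch below lays out a threat map and a credibility dictionary as Python data. Figure 3 is not reproduced in this text, so every key name, file path, pattern, and number here is a hypothetical choice rather than the patent's actual file format; only the identifiers `thread_map` and `thread_dict` come from the description. The negative risk multipliers are likewise an assumption, chosen so that rule hits lower a tag's score.

```python
import re

# Hypothetical credibility dictionary for one document format.
# Keys are XML tag names; every value below is illustrative only.
thread_dict_docx = {
    "Relationship": {
        "initial_score": 80,  # initial credibility, within [0, 100]
        "rules": [
            {   # regular-expression rule scored by number of matches
                "kind": "regex",
                "pattern": re.compile(r"https?://", re.IGNORECASE),
                "basis": "count",
                "score_base": 10,
                "risk_multiplier": -3,  # assumed negative: hits lower the score
            },
            {   # plain-string rule scored by matched length
                "kind": "string",
                "pattern": "attachedTemplate",
                "basis": "length",
                "score_base": 1,
                "risk_multiplier": -2,
            },
        ],
        "special": None,  # e.g. "always_remove" for tags handled without scoring
    },
}

# Hypothetical threat map: per format, the component files to inspect
# (blacklist style) and the credibility dictionary to apply to them.
thread_map = {
    "docx": {
        "files": ["word/_rels/settings.xml.rels", "word/_rels/document.xml.rels"],
        "dictionary": thread_dict_docx,
    },
}

def init_threat_map(doc_format: str) -> dict:
    """Return the threat-map entry for a format, or raise if unsupported."""
    try:
        return thread_map[doc_format]
    except KeyError:
        raise ValueError(f"no threat map configured for format {doc_format!r}")
```

Keeping the dictionary nested inside the threat map mirrors the statement that the threat map embeds the per-format dictionaries, so selecting a format yields both the file list and the scoring rules in one lookup.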

Step 2: Determine whether the OOXML document to be processed is encrypted. If so, first have the password removed by the user or by a computer program, then unpack (decompress) the document with the compression algorithm to obtain its component files; otherwise, unpack the document directly with the compression algorithm to obtain its component files.

Specifically, the OOXML document to be processed is unpacked using the compression algorithm specified by the ECMA-376 Office Open XML format standard, to facilitate subsequent processing. This compression algorithm reduces file size effectively and is lossless. During unpacking, the file is decompressed and restored into its component files. For example, in the file structure of an unpacked PowerPoint document, the files under the /ppt/_slides/_rels directory are one of the places where the path to a remote template is stored. As the OOXML format standard prescribes, the component files are stored in the output folder in a specific folder structure. Some input files, however, cannot be unpacked because the file is damaged, the compression is non-compliant, or required identification information is missing; such files are discarded and recorded in a log for later investigation and handling. Password-encrypted documents cannot be unpacked at this stage either, because OOXML applications such as Word provide password protection and the document must be decrypted before it can be decompressed. Such documents can first have their password removed by the user, or the password can be supplied to a computer program for automatic decryption; after decryption, the document is unpacked with the compression algorithm specified by the ECMA-376 Office Open XML format standard.
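The unpacking in step 2 could be sketched as follows with Python's standard `zipfile` module, assuming the package is an ordinary ZIP container as ECMA-376 packages are. The function name `unpack_ooxml`, the check for the OLE compound file signature as a hint that the document is encrypted, and the check for `[Content_Types].xml` as the "identification information" are assumptions made for this example, not details stated in the patent.

```python
import zipfile

# Signature of an OLE compound file. Password-encrypted Office documents
# (and legacy binary formats) use this container instead of a plain ZIP.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def unpack_ooxml(doc_path: str, out_dir: str) -> list:
    """Unpack an OOXML package into out_dir and return its member names.

    Raises ValueError for encrypted or unreadable input so the caller can
    decrypt first, or discard the file and write a log entry.
    """
    with open(doc_path, "rb") as handle:
        head = handle.read(8)
    if head == OLE_MAGIC:
        raise ValueError("encrypted or legacy binary document: decrypt first")
    if not zipfile.is_zipfile(doc_path):
        raise ValueError("damaged file or non-compliant compression")
    with zipfile.ZipFile(doc_path) as package:
        names = package.namelist()
        if "[Content_Types].xml" not in names:
            raise ValueError("missing [Content_Types].xml part")
        package.extractall(out_dir)
    return names
```

Nothing in this step parses or renders the document content, which matches the claim that the payload gets no chance to run while the file is being taken apart.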

Step 3: Identify the structure of the component files to determine which threat map applies; traverse the component files according to the preset threat map, and store the matched files that are not fully trusted in a list variable.

Specifically, the type of the unpacked component files is identified, for example doc, xls, or ppt. Each file type has a different structure tree under the ECMA-376 Office Open XML format standard, so the applicable threat map must be determined from the structure and names of the unpacked component files. All files produced in step 2 are then traversed against the preset threat map, and those that match threat map entries as not fully trusted are stored in a list variable for traversal in steps 4 and 5.
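One possible reading of step 3 is sketched below: the document family is guessed from the top-level folder of the unpacked package, and the member list is filtered against threat-map patterns. The folder-to-format mapping, the glob-style patterns, and both function names are assumptions for illustration; the patent only says that structure and names are used.

```python
from fnmatch import fnmatchcase
from typing import List, Optional

# Assumed mapping from a package's main folder to its document family.
FORMAT_BY_ROOT = {"word": "docx", "xl": "xlsx", "ppt": "pptx"}

def identify_format(member_names: List[str]) -> Optional[str]:
    """Guess the document family from the top-level folder names."""
    roots = {name.split("/", 1)[0] for name in member_names if "/" in name}
    for root, fmt in FORMAT_BY_ROOT.items():
        if root in roots:
            return fmt
    return None

def select_suspect_files(member_names: List[str], patterns: List[str]) -> List[str]:
    """Return the members matching any threat-map pattern, in package order."""
    return [
        name for name in member_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]

members = [
    "[Content_Types].xml",
    "word/document.xml",
    "word/_rels/settings.xml.rels",
    "docProps/app.xml",
]
suspects = select_suspect_files(members, ["word/_rels/*.rels"])
print(identify_format(members), suspects)
# docx ['word/_rels/settings.xml.rels']
```

`fnmatchcase` is used instead of `fnmatch` so that matching behaves the same on every operating system.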

Step 4: Check the usability of the files in the list variable against the XML format standard, and strip unusable data such as unclosed tags and tags with abnormal symbols.

Specifically, file usability is verified against the XML format standard, and unusable data such as unclosed tags and tags with abnormal symbols is stripped from the file. Checking content usability and content integrity before the file is passed to scoring reduces the chance that an unknown form of attack triggers an exception in the processing module.
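A simplified sketch of the step 4 check, using the standard library parser to test well-formedness. Note the simplification: the patent describes stripping unusable data from within a file, whereas this example rejects a malformed file as a whole. The function name and return shape are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

def filter_well_formed(xml_by_name: Dict[str, bytes]) -> Tuple[dict, List[str]]:
    """Split files into parsed XML trees and names that failed to parse."""
    parsed, rejected = {}, []
    for name, raw in xml_by_name.items():
        try:
            parsed[name] = ET.fromstring(raw)
        except ET.ParseError:
            # Unclosed tags, stray symbols, and similar damage end up here.
            rejected.append(name)
    return parsed, rejected

parsed, rejected = filter_well_formed({
    "ok.rels": b"<Relationships><Relationship/></Relationships>",
    "bad.rels": b"<Relationships><Relationship></Relationships>",
})
print(sorted(parsed), rejected)  # ['ok.rels'] ['bad.rels']
```

Because the input is untrusted, a production implementation would likely also want a parser configuration hardened against entity-expansion attacks; that is outside the scope of this sketch.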

Step 5: Based on the preset credibility dictionary, parse the tag name of each XML tag in the file and assign it an initial credibility score; then match the tag value corresponding to each tag name against the regular expression and string matching rules, derive a threat value for each XML tag from the number of matches and the match length under each rule, and finally obtain the threat score.

Specifically, the XML tag names and tag values in the file are compared against the preset credibility dictionary, and the final overall threat score of each tag is computed according to the applicable calculation rules. The process is shown in Figure 2. After a file is input, every tag present in it is given an initial credibility score by looking up its name in the credibility dictionary; this score lies in the interval [0, 100]. The tag values are then matched against the matching rules (regular expressions, strings, and so on), and a tag threat value is derived from the number of matches and the match length under each rule. For example, a rule based on the number of hits computes: tag threat value = number of hits × risk multiplier × base score. The initial credibility score and the sum of the tag threat values are added to give the final score of the tag, that is, its threat score.
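The scoring rule of step 5 can be worked through in a few lines. The sketch below applies the stated formula (hits × risk multiplier × base score) and adds the result to the initial credibility score. Several points are assumptions: the dictionary layout, the concrete numbers, the length-based variant (the text gives only the hit-count formula), the default score for unlisted tags, and the negative risk multiplier, which is chosen so that adding the tag threat value lowers the score toward the "below threshold means risky" region used in step 6.

```python
import re

# Hypothetical dictionary entry; the numbers are illustrative only.
THREAD_DICT = {
    "Relationship": {
        "initial_score": 80,
        "rules": [
            {"kind": "regex", "pattern": r"https?://", "basis": "count",
             "risk_multiplier": -3, "score_base": 10},
        ],
    },
}

def tag_threat_value(value: str, rule: dict) -> float:
    """Return amount * risk multiplier * base score for one rule."""
    if rule["kind"] == "regex":
        matches = [m.group(0) for m in re.finditer(rule["pattern"], value)]
    else:  # plain substring matching
        matches = [rule["pattern"]] * value.count(rule["pattern"])
    if rule["basis"] == "count":
        amount = len(matches)                  # number of hits
    else:
        amount = sum(len(m) for m in matches)  # total matched length (assumed)
    return amount * rule["risk_multiplier"] * rule["score_base"]

def threat_score(tag_name: str, tag_values: list, thread_dict: dict) -> float:
    """Initial credibility score plus the sum of all tag threat values."""
    entry = thread_dict.get(tag_name)
    if entry is None:
        return 100  # assumption: tags absent from the dictionary are trusted
    score = entry["initial_score"]
    for rule in entry["rules"]:
        for value in tag_values:
            score += tag_threat_value(value, rule)
    return score

remote = threat_score("Relationship", ["https://evil.example/t.dotm"], THREAD_DICT)
local = threat_score("Relationship", ["theme/theme1.xml"], THREAD_DICT)
print(remote, local)  # 50 80
```

With these illustrative numbers, one hit gives 80 + (1 × -3 × 10) = 50, which falls below a threshold of 60, while the local target keeps its initial score of 80.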

Step 6: Determine whether the threat score obtained in step 5 is below the threshold. If so, remove the threat from the XML tag in the file, that is, delete the whole XML tag or modify its tag value; otherwise go to step 8.

Specifically, according to the threat score from step 5, XML tags in the file content are selectively deleted in full or have their tag values modified. As shown in Figure 4, in an XML file that stores templates, each Relationship tag has three attributes: Target (path), Type (type), and Id (sequence). Type records the type of the template, Id records the template's sequence in the document, and Target records the path where the template is located. This step reads the threat removal threshold from the preset configuration. A higher threshold means lower tolerance for threats; a lower threshold means a higher requirement for usability. For example, with the threshold set to 60 (the threshold is chosen to be greater than or equal to 60), any threat tag whose score after threat assessment is below 60 is deemed to carry a risk of template injection. In this step, the tag values of such tags are cleared, or the whole tag is removed, to prevent unknown attacks. For example, in the vast majority of cases the "document templates and add-ins" feature is too specialized for ordinary users to have any use for it, much like VBA macros, yet it is currently the mainstream technique for template injection attacks and carries great risk. In this scenario, the remote template tags matched in step 5 are given a very low score and are deleted in this step, which effectively neutralizes the attack.
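The removal in step 6 might look like the following for a relationships file. The sketch deletes every `Relationship` element whose score falls below the threshold. The scoring function `toy_score` is a stand-in for the step 5 scoring and its values are invented; the sample targets are fictitious. The namespace URI is the one used by OOXML package relationship parts.

```python
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

def toy_score(attrs: dict) -> int:
    """Stand-in for step 5 scoring: external http(s) targets score low."""
    target = attrs.get("Target", "").lower()
    return 20 if target.startswith(("http://", "https://")) else 90

def clean_relationships(xml_text: str, score_fn=toy_score, threshold: int = 60):
    """Delete every Relationship scoring below threshold.

    Returns the cleaned XML text and the number of removed tags.
    """
    ET.register_namespace("", REL_NS)  # keep the default namespace on output
    root = ET.fromstring(xml_text)
    removed = 0
    for rel in list(root.findall(f"{{{REL_NS}}}Relationship")):
        if score_fn(rel.attrib) < threshold:
            root.remove(rel)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed

sample = (
    '<Relationships xmlns="' + REL_NS + '">'
    '<Relationship Id="rId1" Type="t" '
    'Target="https://evil.example/t.dotm" TargetMode="External"/>'
    '<Relationship Id="rId2" Type="t" Target="theme/theme1.xml"/>'
    '</Relationships>'
)
cleaned, removed = clean_relationships(sample)
print(removed)  # 1
```

Deleting the whole tag is the stricter of the two options named in the text; clearing only the `Target` value would preserve the relationship entry while still disabling the remote fetch.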

Step 7: Perform a secondary check: repeat steps 5 and 6 until the threat score in step 6 is greater than or equal to the threshold.

Specifically, if step 6 deleted or modified any content (that is, some tag fell below the safety score threshold), a secondary check is performed: the process returns to step 5 and repeats risk scoring and payload cleaning until the threat removal module considers the input safe enough, meaning that a pass produces its output without performing any cleanup and the content is in a fully trusted state. Only then does the process move to the next step. The purpose of this step is to defend against evasion techniques commonly used by attackers, such as double-written payloads.

The secondary check includes: comparing the file before and after processing for consistency, to assess safety and decide whether threat removal must be run again; and, for output files that may still carry risk after threat removal, reusing the threat scoring of step 5 and the threat removal of step 6, thereby countering evasion techniques commonly used by attackers such as double-written attack payloads.
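To see why a single pass is not enough, consider a cleaner that modifies a tag value by stripping a forbidden substring. A double-written payload is built so that the stripped remainder reassembles the forbidden string. The sketch below is a toy illustration of that effect and of the repeat-until-unchanged loop; `strip_once` and `clean_until_stable` are hypothetical names, and the round limit is an added safeguard not mentioned in the patent.

```python
def strip_once(value: str, needle: str = "http://") -> str:
    """One cleaning pass: remove every non-overlapping occurrence of needle."""
    return value.replace(needle, "")

def clean_until_stable(value: str, clean_pass, max_rounds: int = 10) -> str:
    """Re-run clean_pass until its output equals its input (a fixed point)."""
    for _ in range(max_rounds):
        cleaned = clean_pass(value)
        if cleaned == value:
            return cleaned  # nothing was changed: considered fully trusted
        value = cleaned
    raise RuntimeError("value did not stabilise; treat the document as untrusted")

payload = "hthttp://tp://evil.example/t.dotm"
print(strip_once(payload))                      # http://evil.example/t.dotm
print(clean_until_stable(payload, strip_once))  # evil.example/t.dotm
```

The first pass leaves a working URL behind, which is exactly the bypass; the loop stops only once a pass makes no change, matching the before-and-after consistency comparison described above.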

Step 8: Repackage and compress all processed files with the compression algorithm and restore the OOXML document, giving it back its original file extension so that the user can open the processed document directly in ordinary Office software. The resulting document is the file with the malicious payload removed.
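Repacking in step 8 can be sketched with the standard `zipfile` module, assuming the cleaned component files sit in a directory tree that mirrors the original package. The function name and arguments are hypothetical, and the output path is assumed to lie outside the source directory so the archive does not try to include itself.

```python
import os
import zipfile

def repack_ooxml(src_dir: str, out_path: str) -> None:
    """Zip every file under src_dir into out_path, keeping relative paths."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as package:
        for folder, _dirs, files in os.walk(src_dir):
            for file_name in files:
                full_path = os.path.join(folder, file_name)
                # Package part names always use forward slashes.
                arcname = os.path.relpath(full_path, src_dir).replace(os.sep, "/")
                package.write(full_path, arcname)
```

A call such as `repack_ooxml("unpacked", "report.cleaned.docx")` would then restore the original extension, as the step requires.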

Corresponding to the foregoing embodiments of the unknown threat defense method for OOXML document template injection attacks, the present invention also provides embodiments of an unknown threat defense device for OOXML document template injection attacks.

Referring to Figure 5, the unknown threat defense device for OOXML document template injection attacks provided by an embodiment of the present invention includes one or more processors configured to implement the defense method of the above embodiments.

The embodiments of the unknown threat defense device of the present invention can be applied to any device with data processing capability, such as a computer. The device embodiments may be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the device in the logical sense is formed when the processor of the host device reads the corresponding computer program instructions from non-volatile memory into main memory and runs them. At the hardware level, Figure 5 shows a hardware structure diagram of a device with data processing capability that hosts the defense device of the present invention. In addition to the processor, memory, network interface, and non-volatile memory shown in Figure 5, the host device may include other hardware according to its actual function, which is not described further here.

For the implementation of the functions and roles of each unit in the above device, see the implementation of the corresponding steps in the above method; the details are not repeated here.

Since the device embodiments essentially correspond to the method embodiments, the relevant parts of the method description apply. The device embodiments described above are only illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present invention. A person of ordinary skill in the art can understand and implement this without creative effort.

Corresponding to the foregoing embodiments of the unknown threat defense method for OOXML document template injection attacks, an embodiment of the present application further provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the unknown threat defense method for OOXML document template injection attacks described above. Figure 6 shows a hardware structure diagram of a device with data processing capabilities on which the method provided by the embodiment of the present application runs. In addition to the processor, memory, DMA controller, disk, and non-volatile memory shown in Figure 6, the host device in the embodiment may also include other hardware depending on its actual function, which is not described further here.

Corresponding to the foregoing embodiments of the unknown threat defense method for OOXML document template injection attacks, an embodiment of the present invention further provides a computer-readable storage medium on which a program is stored. When the program is executed by a processor, it implements the unknown threat defense method for OOXML document template injection attacks of the above embodiments.

The computer-readable storage medium may be an internal storage unit, such as a hard disk or memory, of any device with data processing capabilities described in any of the foregoing embodiments. The computer-readable storage medium may also be an external storage device of such a device, for example a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash card provided on the device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the device, and may also be used to temporarily store data that has been output or is to be output.

The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

The above embodiments serve only to illustrate the design ideas and features of the present invention, so that those skilled in the art can understand its content and implement it accordingly; the scope of protection of the present invention is not limited to these embodiments. Therefore, all equivalent changes or modifications made according to the principles and design ideas disclosed in the present invention fall within its scope of protection.

Claims


1. An unknown threat defense method for OOXML document template injection attacks, characterized by comprising the following steps:

2. The unknown threat defense method for OOXML document template injection attacks according to claim 1, characterized in that, in step (1), the threat map takes the form of a blacklist and stores, as a list, the specified files that may carry attack payloads; the threat map has built-in credibility dictionaries for the different formats and is initialized according to the format of the OOXML document to be processed; the credibility dictionary defines a set of tags that may pose a threat risk, and each tag in the set defines an initial credibility score, matching rules for the tag's values, and scoring criteria based on length or number of matches.

3. The unknown threat defense method for OOXML document template injection attacks according to claim 1, characterized in that, in step (2), the compression algorithm is based on the ECMA-376 Office Open XML format standard.

4. The unknown threat defense method for OOXML document template injection attacks according to claim 1, characterized in that, in step (3), the structure identification performed on the various component files includes structure identification and name identification.

5. The unknown threat defense method for OOXML document template injection attacks according to claim 1, characterized in that, in step (5), the matching rules include regular expression matching and string matching.

6. The unknown threat defense method for OOXML document template injection attacks according to claim 1, characterized in that, in step (5), the threat score is obtained by adding the credibility score of the initial XML tag and the threat value.

7. The unknown threat defense method for OOXML document template injection attacks according to claim 1, characterized in that the threshold is greater than or equal to 60.

8. An unknown threat defense device for OOXML document template injection attacks, characterized by comprising one or more processors configured to implement the unknown threat defense method for OOXML document template injection attacks according to any one of claims 1-7.

9. An electronic device, comprising a memory and a processor, characterized in that the memory is coupled to the processor; wherein the memory is used to store program data, and the processor is used to execute the program data to implement the unknown threat defense method for OOXML document template injection attacks according to any one of claims 1-7.

10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, it implements the unknown threat defense method for OOXML document template injection attacks according to any one of claims 1-7.
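
The scoring described in claims 2 and 5 to 7 (a credibility dictionary holding an initial score per tag, regular expression and string matching rules, scoring by length or number of matches, and a threat score formed by addition) might be sketched as follows. This is only an illustrative sketch and not the patented implementation: the tag name `Relationship`, the rule patterns, the point values, the length cutoff, and the assumption that a tag is treated as a threat when its score reaches the threshold are all hypothetical. Only the threshold value of 60 and the additive form of the score come from the claims.

```python
import re
from dataclasses import dataclass, field


@dataclass
class MatchRule:
    pattern: str           # regex source, or a literal substring
    is_regex: bool         # True: regular expression match; False: string match
    points_per_match: int  # hypothetical threat value added per match


@dataclass
class TagEntry:
    initial_score: int                 # initial credibility score of the tag
    rules: list = field(default_factory=list)
    long_value_len: int = 200          # hypothetical length cutoff
    long_value_points: int = 10        # hypothetical points for an over-long value


# Hypothetical credibility dictionary for one document format.
CREDIBILITY_DICT = {
    "Relationship": TagEntry(
        initial_score=20,
        rules=[
            MatchRule(r"^https?://", True, 30),        # remote target
            MatchRule("attachedTemplate", False, 20),  # template relationship
        ],
    ),
}

THRESHOLD = 60  # claim 7: the threshold is greater than or equal to 60


def count_matches(rule: MatchRule, value: str) -> int:
    """Count how many times a rule matches a tag value."""
    if rule.is_regex:
        return len(re.findall(rule.pattern, value, flags=re.IGNORECASE))
    return value.count(rule.pattern)


def threat_score(tag: str, values: list) -> int:
    """Initial credibility score plus the accumulated threat value (claim 6)."""
    entry = CREDIBILITY_DICT.get(tag)
    if entry is None:
        return 0  # assumption: tags outside the dictionary are not scored
    threat_value = 0
    for value in values:
        for rule in entry.rules:
            threat_value += rule.points_per_match * count_matches(rule, value)
        if len(value) > entry.long_value_len:
            threat_value += entry.long_value_points
    return entry.initial_score + threat_value


def is_threat(tag: str, values: list) -> bool:
    """Assumption: a tag is flagged once its score reaches the threshold."""
    return threat_score(tag, values) >= THRESHOLD
```

Under these assumed values, a `Relationship` tag whose attribute values contain both a remote URL and the `attachedTemplate` relationship type reaches the threshold, while a tag that has only one of the two does not. How a flagged tag is then deleted or reorganized is left to the steps of claim 1.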