An application-aware big data deduplication storage system and method

Info

Publication number
CN106066896A
Authority
CN
China
Prior art keywords
deduplication
application
block
data
application file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610561768.5A
Other languages
Chinese (zh)
Other versions
CN106066896B (en)
Inventor
付印金
谢钧
陈卫卫
缪嘉嘉
赵洪华
端义锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PLA University of Science and Technology
Original Assignee
PLA University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PLA University of Science and Technology
Priority to CN201610561768.5A
Publication of CN106066896A
Application granted
Publication of CN106066896B
Status: Active
Anticipated expiration

Abstract

The invention discloses an application-aware big data deduplication storage system and method. The system comprises clients, a management server, and deduplication nodes interconnected over a computer network. Between the client and the management server, an application-aware routing method determines a list of deduplication nodes for storing an application file; between the client and the deduplication nodes, a similarity-aware routing method selects, from that list, the target deduplication node used to store each superblock of the application file. Managing application file storage with this system places files of the same type on the same deduplication nodes, which not only reduces the network communication load but also improves the deduplication ratio and the storage throughput of each deduplication node, enhancing the scalability of the network.

Description

An application-aware big data deduplication storage system and method

Technical Field

The invention relates to the field of computer data storage management, and in particular to an application-aware big data deduplication storage system and method for cloud computing environments.

Background

In the digital world, data volume and complexity are growing explosively. Research by International Data Corporation (IDC) shows that over the past five years the annual volume of data grew ninefold to 7 ZB, and over the next ten years it will grow 44-fold to 35 ZB. The volume of digital information in an enterprise also easily rises to the PB or even EB level. The continuous growth of data in the big data era makes management increasingly complex and raises both data management costs and the risk of data loss. As storage systems keep expanding, they consume more data center storage space, energy, and cooling, and they also demand more management time, greater operational complexity, and a higher risk of human error. At the same time, driven by the performance demands of modern storage systems, memory is replacing disk and disk is replacing tape. Meeting the Service Level Agreements (SLAs) required for big data management while handling the data deluge on changing storage media has therefore become a new challenge.

Data deduplication (dedup for short) storage technology is widely used in disk storage systems to manage massive backups, archives, virtual machine images, and similar data sets; it exploits the high degree of redundancy in stored data to reduce storage capacity requirements and improve network bandwidth utilization. To meet the scalability requirements of big data storage in both capacity and performance, distributed deduplication storage systems in cloud computing environments are applied to the management of massive data sets in order to obtain both a high deduplication ratio and high deduplication throughput. A distributed deduplication storage system usually has a data routing mechanism that distributes application data from clients to multiple deduplication server nodes (deduplication nodes for short), and each deduplication node performs deduplication and storage independently. To remove duplicate data promptly and to optimize data storage and transmission overhead, an inline deduplication mechanism should be chosen in the design of the deduplication storage system.

For large-scale storage systems, inline distributed deduplication storage at the data block level faces two major challenges:

First, information islands among deduplication server nodes. In distributed deduplication, for reasons of system overhead, data is usually deduplicated only within each node and not across nodes, which turns the deduplication server nodes into information islands. Therefore, a data routing mechanism that concentrates redundancy within nodes, reduces data overlap between nodes, keeps communication overhead low, and supports load balancing is crucial for distributed deduplication.

Second, the disk bottleneck of block index lookups. To support deduplication and storage inside a deduplication node, a block index must be kept on disk to map block fingerprints to block storage addresses. This index is usually too large to fit in the limited memory of a deduplication server node, and the frequent random disk accesses needed to look up block fingerprints severely degrade the parallel deduplication of multiple data streams from clients. The disk bottleneck of block index lookups has therefore become a research focus in deduplication systems in recent years.

In addition, a traditional data storage system architecture comprises three layers: the application layer, the file system layer, and the storage hardware layer. Each layer holds its own kind of information about the data it manages, and this information is usually not available to the other layers. To optimize deduplication, it is therefore advisable to co-design storage and applications, so that the low-level storage layer has deep knowledge of the data structures and access characteristics of the high-level application layer. In the prior art, deduplication does not take the content and format of specific application files into account and cannot locate the redundant components within a file.

The present invention is therefore designed for large-scale distributed deduplication systems, which typically comprise thousands of storage server nodes in a cloud environment. Traditional distributed processing methods in the prior art are hard to apply at this scale, because they do not exploit application-layer information and data similarity well and fall short in overall deduplication ratio, per-node throughput, scalability, and communication overhead.

Summary of the Invention

The main technical problem solved by the present invention is to provide an application-aware big data deduplication storage system and method, addressing the problems in the prior art that deduplication storage is not organized by application file type, the network workload is excessive, scalability is limited, and the throughput of deduplication nodes is low.

To solve the above technical problems, one technical solution adopted by the present invention is an application-aware big data deduplication storage system comprising clients, a management server, and deduplication nodes interconnected over a computer network. The client comprises a data partitioning module, a fingerprint computation module, and a similarity-aware data routing module. The data partitioning module divides an application file into fixed-length or variable-length data chunks and then groups the chunks into superblocks. The fingerprint computation module uses a collision-resistant cryptographic hash function to compute the fingerprint of every chunk in a superblock, producing the fingerprint list of that superblock. The similarity-aware data routing module uses the similarity-aware routing method to determine, for each superblock, the target deduplication node that will store it. The management server comprises a file session management module and an application-aware routing decision module. The file session management module stores the mapping between an application file and the fingerprints of the chunks it is divided into, as well as the metadata needed to reconstruct the file. The application-aware routing decision module and the similarity-aware data routing module use the application-aware routing method to determine, for each application file, a list of deduplication nodes for storing its superblocks, which is returned to the client. Each deduplication node comprises an application-aware similarity index lookup module, a block fingerprint cache module, and a parallel container management module. The application-aware similarity index lookup module returns application-aware similarity index lookup results to the client; the block fingerprint cache module caches the fingerprints of recently and frequently accessed chunks to speed up chunk lookups; and the parallel container management module stores unique chunks in parallel.
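As an illustration of the client-side partitioning and fingerprinting described above, the following minimal sketch groups fixed-length chunks into superblocks and fingerprints them with SHA-1; the chunk size, superblock size, and function names are illustrative assumptions rather than values prescribed by the patent.

```python
import hashlib
from typing import List

CHUNK_SIZE = 8 * 1024          # assumed fixed chunk length (8 KB)
CHUNKS_PER_SUPERBLOCK = 512    # assumed number of chunks c per superblock

def partition_into_superblocks(data: bytes) -> List[List[bytes]]:
    """Split a file's bytes into fixed-length chunks and group them into superblocks."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return [chunks[i:i + CHUNKS_PER_SUPERBLOCK]
            for i in range(0, len(chunks), CHUNKS_PER_SUPERBLOCK)]

def fingerprint_superblock(superblock: List[bytes]) -> List[str]:
    """Compute a collision-resistant fingerprint (SHA-1 here) for every chunk."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in superblock]
```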

In another embodiment of the application-aware big data deduplication storage system of the present invention, the collision-resistant cryptographic hash function includes the MD5, SHA-1, and/or SHA-2 functions.

In another embodiment of the application-aware big data deduplication storage system of the present invention, the management server maintains an application routing table that maps each application file type to the identifiers of the corresponding deduplication nodes and the capacities they hold.

In another embodiment of the application-aware big data deduplication storage system of the present invention, each deduplication node keeps an application-aware similarity index table and a block fingerprint cache in memory, and containers in its disk array. The application-aware similarity index table consists of an application file type index and hash tables partitioned by application file type. Each container contains a data section that stores unique chunks and a metadata section that stores the metadata of those chunks. The block fingerprint cache holds all chunk fingerprints of recently accessed containers to accelerate fingerprint lookups within those containers.

In another embodiment of the application-aware big data deduplication storage system of the present invention, each entry of the hash tables partitioned by application file type maps a representative chunk fingerprint of a superblock to the identifier of the container that stores that superblock. The block fingerprint cache is a key-value structure built from a hash table indexed by a doubly linked list.

In another embodiment of the application-aware big data deduplication storage system of the present invention, the application-aware routing method is as follows:

Step 1: at the management server, for an application file sent from the client for storage, determine the extension of the application file.

Step 2: look up the extension in the application routing table of the management server and find the deduplication nodes Ai that already store application files of the same type.

Step 3: form the deduplication node list ID_list = {A1, A2, ..., Am} from all deduplication nodes that store this file type, satisfying ID_list ⊆ {S1, S2, ..., SN}, where {S1, S2, ..., SN} is the list of all deduplication nodes in the application-aware big data deduplication storage system.

Step 4: check the deduplication node list ID_list; if ID_list = ∅ or every deduplication node in ID_list is already full, add to ID_list only the deduplication node SL with the lowest storage load, i.e. ID_list = {SL}.

Step 5: the management server returns to the client the deduplication node list ID_list corresponding to the application file.
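A minimal sketch of these five steps is given below, assuming the routing table is an in-memory dict from file extension to node identifiers and that per-node storage usage ratios are available; the names and the fullness threshold are illustrative assumptions, not part of the patent text.

```python
from typing import Dict, List

def application_aware_route(extension: str,
                            routing_table: Dict[str, List[int]],
                            node_usage: Dict[int, float],
                            full_threshold: float = 0.95) -> List[int]:
    """Return the deduplication node list ID_list for one application file type."""
    # Steps 1-3: nodes already storing this file type (ID_list ⊆ {S1..SN}).
    id_list = [n for n in routing_table.get(extension, []) if n in node_usage]
    # Step 4: empty list or all candidates full -> fall back to the least-loaded node SL.
    if not id_list or all(node_usage[n] >= full_threshold for n in id_list):
        id_list = [min(node_usage, key=node_usage.get)]
    # Step 5: the list is returned to the client.
    return id_list
```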

In another embodiment of the application-aware big data deduplication storage system of the present invention, the similarity-aware routing method is as follows:

Step 1: at the client, the data partitioning module divides the application file into c chunks and groups them into a superblock S; the cryptographic hash function yields the fingerprint list {fp1, fp2, ..., fpc} of all c chunks of S.

Step 2: sort the fingerprint list {fp1, fp2, ..., fpc} and select the k smallest chunk fingerprints {rfp1, rfp2, ..., rfpk} as the representative fingerprints of superblock S, i.e. the handprint of S.

Step 3: from the N deduplication nodes of the application-aware big data deduplication storage system, select k candidate deduplication nodes whose identifiers are {rfp1 mod N, rfp2 mod N, ..., rfpk mod N}, and send the handprint of S to each of the k candidates.

Step 4: each of the k candidate deduplication nodes checks which representative fingerprints of the handprint of S it already stores, yielding the counts {r1, r2, ..., rk} of representative fingerprints already present; these counts serve as the similarity values of the k candidates with respect to S.

Step 5: divide the storage usage of each of the k candidate deduplication nodes by the average storage usage of all N deduplication nodes to obtain their relative storage usage {w1, w2, ..., wk}.

Step 6: among the k candidate deduplication nodes, select the node whose index i satisfies ri/wi = max{r1/w1, r2/w2, ..., rk/wk} as the target deduplication node for storing superblock S.
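The following sketch puts these steps together on the client side; the per-node handprint query is stubbed out as a callable, and treating the fixed-length hexadecimal fingerprints as integers for the mod-N mapping is an assumption about representation rather than something stated in the patent.

```python
from typing import Callable, Dict, List

def similarity_aware_route(fingerprints: List[str],
                           n_nodes: int,
                           k: int,
                           query_node: Callable[[int, List[str]], int],
                           node_usage: Dict[int, float]) -> int:
    """Pick the target deduplication node for one superblock."""
    # Step 2: handprint = k smallest fingerprints (lexicographic order works for
    # fixed-length hex digests).
    handprint = sorted(fingerprints)[:k]
    # Step 3: candidate node IDs derived from the representative fingerprints mod N.
    candidates = [int(rfp, 16) % n_nodes for rfp in handprint]
    avg_usage = sum(node_usage.values()) / len(node_usage)
    best_node, best_score = candidates[0], float("-inf")
    for node in candidates:
        # Step 4: r_i = how many representative fingerprints the node already stores.
        r = query_node(node, handprint)
        # Step 5: relative storage usage w_i = node usage / average usage.
        w = node_usage[node] / avg_usage if avg_usage > 0 else 1.0
        # Step 6: choose the node maximizing the weighted similarity r_i / w_i.
        score = r / w if w > 0 else float(r)
        if score > best_score:
            best_node, best_score = node, score
    return best_node
```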

The present invention also provides an application-aware big data deduplication method, which stores and retrieves application files on the basis of the aforementioned application-aware big data deduplication storage system. The storage method for an application file is as follows:

Step 1: the client sends the management server a PutFileReq message requesting to store an application file in the application-aware big data deduplication storage system; the PutFileReq message contains the metadata of the application file.

Step 2: after receiving the PutFileReq message, the management server stores the metadata of the application file and confirms that the application-aware big data deduplication storage system has enough storage space for it. The management server then uses the application-aware routing method to determine a list of deduplication nodes for storing the application file, and sends the client a PutFileResp message containing the application file identifier assigned by the management server and the deduplication node list for the file.

Step 3: after receiving the PutFileResp message, the client, for the first superblock SuperChunk_1 of the application file, uses the similarity-aware routing method to send LookupSCReq queries to the k candidate deduplication nodes. Each candidate performs a similarity index lookup on the representative fingerprints of SuperChunk_1, determines its weighted similarity value, and replies to the client with a LookupSCResp message carrying that weighted similarity value for SuperChunk_1.

Step 4: after receiving the LookupSCResp messages from the k candidate deduplication nodes, the client compares the weighted similarity values and selects the candidate with the largest value as the target deduplication node for storing SuperChunk_1. It notifies the management server of the identifier of the target deduplication node with a PutSCReq message, and the management server replies to the client with a PutSCResp message.

Step 5: the client sends a LookupChunksSCReq request to the target deduplication node and transmits all chunk fingerprints of SuperChunk_1 in batches, so that the node can check which chunks are duplicates. After looking up the chunk fingerprints, the target deduplication node returns a LookupChunksSCResp message containing the list of unique chunks of SuperChunk_1 that are not already stored on that node.

Step 6: the client sends a UniqueChunksSC message to the target deduplication node and transmits the unique chunks of SuperChunk_1 in batches; the target deduplication node receives and stores the unique chunks and then returns an SCAck acknowledgement to the management server.

Step 7: repeat steps 3 to 6 for each subsequent superblock of the application file until all superblocks have been stored, which completes the storage of the application file.
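To make the seven-step exchange easier to follow, here is a sketch of the client-side control flow, with every network call passed in as a plain callable; the payload shapes and function names are assumptions used only to mirror the sequence PutFileReq → PutFileResp → LookupSCReq/Resp → PutSCReq/Resp → LookupChunksSCReq/Resp → UniqueChunksSC → SCAck described above.

```python
from typing import Callable, Dict, List, Sequence

def store_file(put_file_req: Callable[[dict], dict],
               lookup_sc_req: Callable[[int, List[str]], float],
               put_sc_req: Callable[[str, int, int], None],
               lookup_chunks_sc_req: Callable[[int, List[str]], List[str]],
               unique_chunks_sc: Callable[[int, Dict[str, bytes]], None],
               file_meta: dict,
               superblocks: Sequence[dict]) -> None:
    """Client-side control flow of the storage protocol; each callable stands in for one RPC."""
    # Steps 1-2: PutFileReq -> PutFileResp (file registered, node list returned).
    resp = put_file_req(file_meta)
    file_id = resp["file_id"]
    for i, sb in enumerate(superblocks):                     # Step 7: repeat per superblock
        # Step 3: LookupSCReq/LookupSCResp - weighted similarity from each candidate node.
        scores = {node: lookup_sc_req(node, sb["handprint"]) for node in sb["candidates"]}
        # Step 4: choose the most similar node and report it (PutSCReq -> PutSCResp).
        target = max(scores, key=scores.get)
        put_sc_req(file_id, i, target)
        # Step 5: LookupChunksSCReq/LookupChunksSCResp - which fingerprints are new there.
        missing = lookup_chunks_sc_req(target, sb["fingerprints"])
        # Step 6: UniqueChunksSC - send only the unique chunks; the node replies with SCAck.
        unique_chunks_sc(target, {fp: sb["chunks"][fp] for fp in missing})
```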

In another embodiment of the application-aware big data deduplication method of the present invention, the retrieval method for an application file is as follows:

Step 1: the client sends a GetFileReq request to the management server to retrieve an application file. The management server handles the request by querying the file's metadata and returns a GetFileResp message containing the list of superblocks of the requested file and the mapping from each superblock to its deduplication node.

Step 2: after receiving the GetFileResp message, the client sends a GetSuperChunk message to the deduplication node corresponding to each superblock in the file's superblock list, requesting that node to fetch the superblock.

Step 3: the deduplication node receives the GetSuperChunk message, retrieves the corresponding superblock from its containers, and returns it to the client.

Step 4: after receiving the superblock returned by the deduplication node, the client uses the superblock checksum and the identifier of the requested application file to verify the integrity of the data, completing the retrieval of the application file.
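A corresponding sketch of the retrieval side, again with hypothetical callables standing in for the GetFileReq/GetFileResp and GetSuperChunk exchanges, and with SHA-1 as an assumed choice for the unspecified checksum:

```python
import hashlib
from typing import Callable, List, Tuple

def retrieve_file(get_file_req: Callable[[str], List[Tuple[str, int, str]]],
                  get_super_chunk: Callable[[int, str], bytes],
                  file_name: str) -> bytes:
    """Client-side retrieval flow; the two callables stand in for GetFileReq and GetSuperChunk."""
    data = b""
    # Step 1: GetFileReq -> GetFileResp, a list of (superblock_id, node_id, checksum) entries.
    for sb_id, node_id, checksum in get_file_req(file_name):
        # Steps 2-3: fetch each superblock from the deduplication node that stores it.
        sb = get_super_chunk(node_id, sb_id)
        # Step 4: verify integrity; SHA-1 here is an assumption, not the patent's choice.
        if hashlib.sha1(sb).hexdigest() != checksum:
            raise IOError(f"superblock {sb_id} failed its integrity check")
        data += sb
    return data
```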

The beneficial effects of the present invention are as follows. The invention provides an application-aware big data deduplication storage system and method comprising clients, a management server, and deduplication nodes interconnected over a computer network. Between the client and the management server, an application-aware routing method determines the deduplication node list for storing an application file; between the client and the deduplication nodes, a similarity-aware routing method selects the target deduplication node from that list for storing each superblock of the file. This two-level routing approach effectively reduces the communication load on the network. Storing application files through this system places files of the same type on the same deduplication nodes, which improves the deduplication ratio and the storage throughput of each deduplication node and enhances the scalability of the network.

Brief Description of the Drawings

Fig. 1 is a system architecture diagram of an embodiment of the application-aware big data deduplication storage system according to the present invention;

Fig. 2 shows the application routing table of the management server in an embodiment of the application-aware big data deduplication storage system according to the present invention;

Fig. 3 shows the data structures of a deduplication node in an embodiment of the application-aware big data deduplication storage system according to the present invention;

Fig. 4 is a flow chart of application file storage in an embodiment of the application-aware big data deduplication method according to the present invention;

Fig. 5 is a flow chart of application file retrieval in an embodiment of the application-aware big data deduplication method according to the present invention.

Detailed Description

To facilitate understanding of the present invention, it is described in more detail below with reference to the accompanying drawings and specific embodiments. The drawings show preferred embodiments of the invention. The invention can, however, be implemented in many different forms and is not limited to the embodiments described in this specification; rather, these embodiments are provided so that the disclosure of the invention will be understood more thoroughly and completely.

It should be noted that, unless otherwise defined, all technical and scientific terms used in this specification have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the specification are intended only to describe specific embodiments and not to limit the invention. The term "and/or" as used in this specification includes any and all combinations of one or more of the associated listed items.

Fig. 1 shows the system architecture of an embodiment of the application-aware big data deduplication storage system of the present invention. The system comprises clients 10, a management server 20, and deduplication nodes 30; as shown, there are multiple clients 10 and multiple deduplication nodes 30. Each client 10 comprises three modules: a data partitioning module 101, a fingerprint computation module 102, and a similarity-aware data routing module 103. The data partitioning module 101 splits the application file to be stored, dividing file-level data into small chunks of fixed or variable length and then grouping these chunks into data superblocks (superblocks for short). Next, the fingerprint computation module 102 computes a fingerprint for every chunk of a superblock using a collision-resistant cryptographic hash function such as MD5, SHA-1, and/or SHA-2, producing the chunk fingerprint list of that superblock. The similarity-aware data routing module 103 then uses the similarity-aware routing method to select a suitable deduplication node 30 for storing each superblock. For this, the similarity-aware data routing module 103 of the client 10 first interacts with the application-aware routing decision module of the management server 20 through the application-aware routing method, which provides, for each application file to be stored, a list of deduplication nodes for storing its superblocks. The similarity-aware data routing module 103 of the client 10 then uses the similarity-aware routing method to find, within that list, a suitable target deduplication node 30 for each superblock of the file. The application-aware routing method and the similarity-aware routing method are further illustrated below.

In Fig. 1, the management server 20 comprises a file session management module 201 and an application-aware routing decision module 202. The file session management module 201 stores the mapping between an application file to be stored and the fingerprints of the chunks it is divided into, as well as the metadata needed to reconstruct the file. All file-level metadata of stored application files is therefore kept on the management server 20, and the client 10 and the management server 20 interact through read and write operations on file metadata; the file storage and file retrieval processes are described in detail below with reference to Fig. 4 and Fig. 5, respectively. The application-aware routing decision module 202 and the similarity-aware data routing module 103 use the application-aware routing method to determine, for an application file, a list of deduplication nodes for storing its superblocks, which is returned to the client 10. Preferably, to ensure reliability, the management server 20 can be deployed as two servers in an active-standby failover configuration, avoiding the adverse effects of a single server failure.

Preferably, Fig. 2 shows an embodiment of a data structure in the management server 20 called the application routing table, which is used to execute the application-aware routing method of the application-aware routing decision module 202. The table resides in the RAM of the management server 20, ensuring a high processing speed. The first column, Type, is the file type, distinguished by the application file extension (doc, rmvb, jpg, and so on); the second column, NodeID, is the deduplication node identifier, a numeric serial number; the third column, Capacity, is the storage capacity occupied on that deduplication node by application files of the corresponding type, expressed in bytes. Each row corresponds to one storage mapping; for example, the first row indicates that the deduplication node with NodeID 5 stores application files of type doc and that all doc documents on that node occupy 235 MB. With this application routing table, the management server 20 can, for each application file to be stored, first look up its file type, then determine which deduplication nodes store that type and how much capacity they hold, and from this compute the storage workload across those nodes.

Further, once the client 10 has sent the management server 20 the extension of an application file to be stored, the management server 20 obtains a deduplication node list for that file through the application-aware routing method. A preferred embodiment of the application-aware routing method is given below.

First, for an application file fullname to be stored from the client 10, determine the extension of fullname. Then query the application routing table of the management server 20 (shown in Fig. 2) to find the deduplication nodes Ai corresponding to that extension, i.e. the nodes that already store application files of the same type (the same extension). Next, form the deduplication node list ID_list = {A1, A2, ..., Am} from all deduplication nodes that store this type of file. For the whole deduplication system, if the list of all deduplication nodes is {S1, S2, ..., SN}, then ID_list ⊆ {S1, S2, ..., SN} must hold. Further, check ID_list: if ID_list = ∅ (the list is empty and no matching deduplication node was found) or every deduplication node in ID_list is already full (storage overloaded), then add to ID_list only the deduplication node SL with the lowest storage load, i.e. ID_list = {SL}. Finally, the management server 20 returns to the client 10 the deduplication node list ID_list corresponding to the application file fullname.

Further, in Fig. 1, the deduplication node 30 comprises an application-aware similarity index lookup module 301, a block fingerprint cache module 302, and a parallel container management module 303. The application-aware similarity index lookup module 301 returns application-aware similarity index lookup results to the client 10. As described above, after the client 10 receives from the management server 20 the deduplication node list for an application file to be stored, it must further determine, for each superblock of that file, a deduplication node to store it (the target deduplication node); the method used is called the similarity-aware routing method and is illustrated further below. Once the target deduplication node for a superblock has been determined, a fingerprint lookup is performed between the client 10 and the deduplication node 30 to check whether the fingerprint of each chunk of the superblock already exists on that node; only chunks whose fingerprints are absent are treated as unique chunks and transferred to and stored on the deduplication node 30. The block fingerprint cache module 302 in Fig. 1 caches the fingerprints of the chunks of recently and frequently accessed superblocks to speed up the detection of redundant chunks. The parallel container management module 303 stores unique chunks in parallel to accelerate the storage process. In addition, when the deduplication node 30 stores a new unique chunk, it also sends the metadata of that chunk to the management server 20 for updating.

Preferably, a preferred embodiment of the similarity-aware routing method is described below. As noted above, the data partitioning module 101 of the client 10 in Fig. 1 divides the application file.

Step 1: divide an application file into c small chunks and group them into a superblock S; the cryptographic hash function yields the complete chunk fingerprint list {fp1, fp2, ..., fpc} of S. Here, fp is short for fingerprint.

Step 2: after sorting the chunk fingerprint list {fp1, fp2, ..., fpc} of superblock S, select the k smallest chunk fingerprints {rfp1, rfp2, ..., rfpk} as the representative fingerprints of S, i.e. the handprint of S, and send the handprint of S to k candidate deduplication nodes within the cluster of N deduplication nodes; the identifiers of these candidates are {rfp1 mod N, rfp2 mod N, ..., rfpk mod N}. Here, rfp is short for representative fingerprint.

Step 3: each candidate deduplication node checks, in its existing similarity index built from the representative fingerprints of stored superblocks, whether the representative fingerprints of S are present, which gives the counts {r1, r2, ..., rk} of representative fingerprints already stored on the candidates; these counts directly measure the similarity between superblock S and the data already stored on those nodes.

Step 4: divide the storage usage of each candidate deduplication node by the average storage usage of all deduplication nodes to obtain the relative storage usage {w1, w2, ..., wk}, and use these values to compute the weighted similarity values {r1/w1, r2/w2, ..., rk/wk} of the candidates, thereby balancing the storage load across the candidate deduplication nodes.

Step 5: among the candidate deduplication nodes, select the node whose index i satisfies ri/wi = max{r1/w1, r2/w2, ..., rk/wk} as the routing target for superblock S. Based on this embodiment, the client 10 in Fig. 1 can find a suitable target among the deduplication nodes 30 for every superblock of an application file.

Preferably, Fig. 3 shows an embodiment of the data structures in the deduplication node 30. In Fig. 3, an application-aware similarity index table 3011 and a block fingerprint cache 3021 are kept in RAM, and containers 3031 reside in the disk array. The application-aware similarity index table 3011 is an in-memory data structure consisting of an application file type index 3011A and hash tables 3011B partitioned by application file type. When a similarity index query arrives from the client 10, the deduplication node 30 uses the application file type of the superblock to route the query, via the application file type index 3011A, to the hash table 3011B of the same application file type. For example, in Fig. 3, if the file type of the superblock sent to the deduplication node 30 is doc, the superblock is routed to the doc hash table; likewise, a jpg superblock is routed to the jpg hash table. Each entry of hash table 3011B holds one mapping, from a representative fingerprint (RFP) of a superblock to the identifier of the container (CID) that stores that superblock. This similarity index is thus a hash-table-based in-memory structure whose entries map superblock representative fingerprints to container identifiers. Because the superblock handprint technique needs only a very low sampling rate, the index is far smaller than a traditional on-disk index mapping chunk fingerprints to containers, giving this method a clear advantage in processing speed.
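A compact sketch of this two-level index follows, using nested dicts to stand in for the application file type index and the per-type hash tables; the class and method names are assumptions made for illustration, not names used in the patent.

```python
from collections import defaultdict
from typing import Dict, Optional

class AppAwareSimilarityIndex:
    """In-memory index: application file type -> {representative fingerprint -> container ID}."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, int]] = defaultdict(dict)

    def insert(self, file_type: str, rfp: str, container_id: int) -> None:
        # Route by file type first, then record the RFP -> CID mapping.
        self._tables[file_type][rfp] = container_id

    def lookup(self, file_type: str, rfp: str) -> Optional[int]:
        # Return the container ID if the representative fingerprint is known, else None.
        return self._tables.get(file_type, {}).get(rfp)
```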

In Fig. 3, a container 3031 is a self-describing data structure kept in the disk array to preserve locality. It contains a data section that stores unique chunks and a metadata section that stores the metadata of those chunks (such as chunk fingerprint, offset, and length). The deduplication node 30 supports parallel container management for concurrent allocation, reclamation, reading, writing, and reliable storage. For parallel data storage, one open data container is kept per data stream; when a container fills up, a new one is created and opened. All disk accesses are performed at container granularity.

In Fig. 3, besides the two important data structures above, there is also the block fingerprint cache 3021, which is key to improving deduplication node performance. At run time, it holds in memory all chunk fingerprints of recently accessed containers 3031. Once a representative fingerprint is hit by a similarity index query, all chunk fingerprints of the container holding the corresponding chunk are prefetched into the block fingerprint cache 3021 to accelerate chunk fingerprint lookups.

Preferably, the block fingerprint cache 3021 is a key-value structure built from a hash table indexed by a doubly linked list. When the cache is full and the fingerprints of a container 3031 no longer contribute noticeably to accelerating lookups, a least-recently-used policy evicts the old entries. Thus, when the representative fingerprints of a superblock are queried, the container mapped to that superblock is first looked up in the block fingerprint cache 3021. If the container's chunk fingerprints are already in the cache, the superblock's chunk fingerprints are compared against the fingerprints in the container's metadata section; otherwise, all chunk fingerprints in the container's metadata section are prefetched from the container. Finally, chunks whose fingerprints are not found are stored in an open, not yet full container. By preserving chunk fingerprint cache locality through container management, the similarity index lookup optimization achieves high throughput with only a small memory footprint.
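The "hash table indexed by a doubly linked list" with least-recently-used replacement corresponds to a classic LRU cache; a minimal sketch using Python's OrderedDict (which internally combines a hash table with a doubly linked list) is given below, with the capacity and method names chosen only for illustration.

```python
from collections import OrderedDict
from typing import Optional, Set

class BlockFingerprintCache:
    """LRU cache mapping container ID -> set of chunk fingerprints in that container."""

    def __init__(self, capacity: int = 1024) -> None:
        self._capacity = capacity
        self._containers: "OrderedDict[int, Set[str]]" = OrderedDict()

    def prefetch(self, container_id: int, fingerprints: Set[str]) -> None:
        # Insert (or refresh) a container's fingerprints; evict the least recently used.
        self._containers[container_id] = fingerprints
        self._containers.move_to_end(container_id)
        if len(self._containers) > self._capacity:
            self._containers.popitem(last=False)

    def contains(self, container_id: int, fingerprint: str) -> Optional[bool]:
        # None means the container is not cached and must be prefetched from disk first.
        if container_id not in self._containers:
            return None
        self._containers.move_to_end(container_id)
        return fingerprint in self._containers[container_id]
```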

From the above analysis it can be seen that, to store a superblock, the embodiments of the present invention adopt a two-level routing mechanism: between the client and the management server, routing is performed at file granularity, using the application-aware routing method to determine the deduplication node list for storing an application file; between the client and the deduplication nodes, routing is performed at superblock granularity, using the similarity-aware routing method to determine the target deduplication node for each superblock. In this two-level scheme, the file-level interaction between client and management server reduces their communication load and the workload of the management server, while the superblock-level interaction between client and deduplication nodes makes data placement more targeted and improves overall network efficiency.

Based on the same concept as the above application-aware big data deduplication storage system, the present invention also provides an application-aware big data deduplication storage method, described below with reference to Fig. 4 and Fig. 5.

Based on the embodiment of the application-aware big data deduplication storage system shown in Fig. 1, Fig. 4 shows how an application file is stored in the system.

When a user or application sends a file storage request to the client, the client first divides the file into chunks and superblocks in the data partitioning module and computes the chunk fingerprints with a cryptographic hash function in the fingerprint computation module; the similarity-aware data routing module then extracts the superblock handprint and performs data routing. The actual data storage operations are carried out by the similarity-aware data routing module through interaction with the management server and the deduplication nodes. The detailed steps are as follows:

Step 1: the client sends a PutFileReq message to the management server, containing the application file identifier, file size, file name, timestamp, number of superblocks in the file, superblock checksums, and so on. The application file identifier is a collision-resistant hash of the file content and can also be used to verify file integrity during retrieval.
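A sketch of the fields this message carries, expressed as a dataclass; the field names and types are assumptions derived from the enumeration above, and computing the file identifier with SHA-256 is merely one possible choice of collision-resistant hash.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import List

@dataclass
class PutFileReq:
    file_id: str                 # collision-resistant hash of the file content
    file_size: int
    file_name: str
    timestamp: float
    superblock_count: int
    superblock_checksums: List[str] = field(default_factory=list)

def build_put_file_req(file_name: str, data: bytes, checksums: List[str]) -> PutFileReq:
    """Assemble the request from the raw file bytes and precomputed superblock checksums."""
    return PutFileReq(
        file_id=hashlib.sha256(data).hexdigest(),
        file_size=len(data),
        file_name=file_name,
        timestamp=time.time(),
        superblock_count=len(checksums),
        superblock_checksums=checksums,
    )
```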

Step 2: the management server receives the PutFileReq request, stores the metadata of the application file, and confirms that the whole big data deduplication storage system has enough space to store the file. The management server maintains metadata only at file level rather than at superblock granularity, so it can manage a larger volume of data and the whole system scales better.

The management server also makes the application-aware routing decision, selecting for the application file a list of deduplication nodes on which its superblocks will be stored. It then sends a PutFileResp response back to the client, containing the file identifier and the deduplication node list for that file.

Step 3: after receiving the PutFileResp message, the client, for each superblock of the file, uses the similarity-aware routing method to send k LookupSCReq queries to the k candidate deduplication nodes, in order to look up the application-aware similarity index for the representative fingerprints of that superblock on those candidates. Fig. 4 shows three deduplication nodes, deduplication node 1, deduplication node 2, and deduplication node 3, of which only nodes 2 and 3 are candidates; LookupSCReq_1 requests are therefore sent to nodes 2 and 3, where the suffix _1 distinguishes different superblocks and denotes the first superblock of the file. Upon receiving the request, deduplication nodes 2 and 3 each perform the similarity index lookup shown in Fig. 4: following the similarity-aware routing method, each candidate determines its weighted similarity value for that superblock.

Step 4: each candidate deduplication node replies to the client with a LookupSCResp message containing its weighted similarity value for the superblock. In Fig. 4, deduplication nodes 2 and 3 reply with LookupSCResp_1 messages.

Step 5: after receiving these LookupSCResp messages, the client compares the weighted similarity values and selects the candidate with the largest value as the target deduplication node for storing the superblock; it then informs the management server of that node's identifier with a PutSCReq message, and the management server replies with PutSCResp. For the first superblock of the file in Fig. 4, these messages are labelled PutSCReq_1 and PutSCResp_1.

Step 6: once the target deduplication node is determined, the client sends it a LookupChunksSCReq request and transmits all chunk fingerprints of the superblock in batches, so that the node can check whether each chunk is a duplicate. After the fingerprint lookup, the target deduplication node returns a LookupChunksSCResp message. In Fig. 4, the LookupChunksSCReq_1 request handles the first superblock; the target is deduplication node 2, which performs the ChunkFPsLookup operation to look up the chunk fingerprints and then replies to the client with a LookupChunksSCResp_1 message containing the list of unique chunks in that superblock.

In the seventh step, the client sends a UniqueChunksSC message to the target deduplication node and transfers the unique data chunks of the superchunk in a batch. The target deduplication node stores these unique chunks and then returns an SCAck acknowledgement to the management server. In Figure 4 these are the UniqueChunksSC_1 and SCAck_1 messages for the first superchunk.
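A plausible way for the target node to absorb the unique chunks is to append them to an open container, writing payloads to the data segment and fingerprint/offset records to the metadata segment before acknowledging with SCAck. The container layout and the SHA-1 fingerprinting below are assumptions for illustration, not details fixed by the patent.

```python
# Hypothetical container append path for step 7: payloads go to the data segment,
# (fingerprint, offset, length) records to the metadata segment.
import hashlib

class Container:
    def __init__(self, container_id):
        self.container_id = container_id
        self.data_segment = bytearray()
        self.metadata_segment = []                # (fingerprint, offset, length) records

    def append_chunk(self, chunk: bytes) -> str:
        fp = hashlib.sha1(chunk).hexdigest()
        self.metadata_segment.append((fp, len(self.data_segment), len(chunk)))
        self.data_segment.extend(chunk)
        return fp

def store_unique_chunks(container, fingerprint_index, unique_chunks):
    """Store the chunks received in UniqueChunksSC and update the node's index."""
    for chunk in unique_chunks:
        fp = container.append_chunk(chunk)
        fingerprint_index[fp] = container.container_id
    return "SCAck"                                # acknowledgement sent to the management server
```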

The same sequence is then repeated for the second superchunk of the file, and so on, until every superchunk of the application file has been stored on its deduplication node.

Figure 5 shows how an application file is retrieved from the big data deduplication storage system. First, the client sends a GetFileReq request to the management server to retrieve a file. The management server responds to the GetFileReq request by looking up the file's metadata and returning a GetFileResp message, which contains the list of superchunks in the requested application file and the mapping from each superchunk to the deduplication node that stores it. Second, on receiving the GetFileResp message, the client sends GetSuperChunk messages to the deduplication nodes holding the file's superchunks, requesting those superchunks. In the example of Figure 5 the file contains two superchunks, so the client sends GetSuperChunk_2 to deduplication node 1 and GetSuperChunk_1 to deduplication node 2. Third, each deduplication node receives its GetSuperChunk message, reads the corresponding superchunk from its container, and returns it to the client: deduplication node 1 returns SuperChunk_2 and deduplication node 2 returns SuperChunk_1. Fourth, after receiving the superchunks, the client verifies data integrity using the superchunk checksums and the file identifier, completing retrieval of the file.
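The retrieval path can be summarized as a lookup of the file's superchunk map followed by per-superchunk fetches and integrity checks. The sketch below assumes the management server's metadata can be modelled as an ordered list of (superchunk id, checksum, node) entries and uses SHA-1 as a placeholder checksum; neither assumption is mandated by the patent.

```python
# Sketch of the GetFileReq / GetSuperChunk retrieval flow in Figure 5.
import hashlib

def get_file(file_id, file_metadata, nodes):
    """
    file_metadata: file id -> ordered list of (superchunk id, checksum, node name)
    nodes: node name -> {superchunk id: bytes}
    """
    data = bytearray()
    for sc_id, checksum, node_name in file_metadata[file_id]:   # from GetFileResp
        payload = nodes[node_name][sc_id]                       # GetSuperChunk reply
        if hashlib.sha1(payload).hexdigest() != checksum:       # integrity check (step 4)
            raise IOError(f"superchunk {sc_id} of file {file_id} failed verification")
        data.extend(payload)
    return bytes(data)
```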

The application-aware big data deduplication storage system and method provided by the invention comprise a client, a management server, and deduplication nodes interconnected over a computer network. The client and the management server use the application-aware routing method to determine a list of candidate deduplication nodes for storing an application file, and the client and the deduplication nodes use the similarity-aware routing method to select, from that list, the target deduplication node that stores each superchunk of the file. This two-tier routing scheme effectively reduces the communication load on the network; storing application files through the system places files of the same type on the same deduplication node, which improves the deduplication ratio and the storage throughput of each deduplication node, and enhances the scalability of the distributed storage system.

The above is only an embodiment of the present invention and does not limit the patent scope of the invention. Any equivalent structural transformation made using the contents of the description and drawings of the present invention, and any direct or indirect application in other related technical fields, falls within the scope of patent protection of the present invention.

Claims (9)

The application-aware big data deduplication storage system according to claim 3, characterised in that the memory of each deduplication node is provided with an application-aware similarity index table and a chunk fingerprint cache, and its disk array is provided with containers; the application-aware similarity index table is composed of an application file type index and hash tables organised by application file type; each container comprises a data segment storing unique data chunks and a metadata segment storing the meta-information of the corresponding unique data chunks; and the chunk fingerprint cache holds all chunk fingerprints of recently accessed containers, so as to accelerate lookups of the chunk fingerprints in those containers.
Second step: after receiving the PutFileReq message, the management server stores the metadata of the application file and confirms that the application-aware big data deduplication storage system has enough storage space for the application file; the management server then uses the application-aware routing method to determine a list of deduplication nodes for storing the application file; the management server then sends a PutFileResp message back to the client, the PutFileResp message containing the identifier of the application file returned by the management server to the client and the deduplication node list corresponding to the application file;
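To make the node-side state named in the claims above easier to picture, here is a rough Python sketch of an application-aware similarity index (a file-type index over per-type hash tables) and a chunk fingerprint cache holding all fingerprints of recently accessed containers; the LRU policy and the capacity bound are illustrative assumptions rather than claim limitations.

```python
# Rough sketch of the per-node in-memory structures referred to in the claims.
from collections import OrderedDict

class ApplicationAwareSimilarityIndex:
    """File-type index over per-type hash tables of representative fingerprints."""
    def __init__(self):
        self.by_type = {}                          # file type -> {rep fingerprint: container id}

    def lookup(self, app_type, rep_fingerprint):
        return self.by_type.get(app_type, {}).get(rep_fingerprint)

    def insert(self, app_type, rep_fingerprint, container_id):
        self.by_type.setdefault(app_type, {})[rep_fingerprint] = container_id

class ChunkFingerprintCache:
    """Keeps all chunk fingerprints of recently accessed containers (assumed LRU)."""
    def __init__(self, max_containers=64):
        self.max_containers = max_containers
        self.per_container = OrderedDict()         # container id -> set of fingerprints

    def load_container(self, container_id, fingerprints):
        self.per_container[container_id] = set(fingerprints)
        self.per_container.move_to_end(container_id)
        if len(self.per_container) > self.max_containers:
            self.per_container.popitem(last=False) # evict the least recently used container

    def contains(self, fingerprint):
        return any(fingerprint in fps for fps in self.per_container.values())
```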