CN119628904A - Data deduplication system and method based on secure IP tunnel between networks

Info

Publication number
CN119628904A
Authority
CN
China
Prior art keywords
data
local area
area network
storage server
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411733168.3A
Other languages
Chinese (zh)
Inventor
郑嘉琦
郭佳怡
陈贵海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2024-11-28
Filing date
2024-11-28
Publication date
2025-03-14
Application filed by Nanjing University
Priority to CN202411733168.3A
Publication of CN119628904A
Legal status: Pending


Abstract



The present invention proposes a data deduplication system and method based on an inter-network secure IP tunnel, comprising: an inter-network secure IP tunnel, established between a local area network and a data center on programmable switches, which encrypts in full each deduplication-related data packet leaving the local area network or the data center and decrypts in full each deduplication-related data packet entering them; an edge switch corresponding to each storage server, which performs the in-network computation for data deduplication and encryption; and a centralized controller inside the data center network, which is responsible for key negotiation with the controller inside the local area network. The invention supports a finer deduplication granularity and can therefore significantly improve deduplication efficiency while introducing an acceptable additional storage cost and reducing server CPU consumption.

Description

Data deduplication system and method based on inter-network secure IP tunnel
Technical Field
The invention belongs to the field of network systems, and particularly relates to a data deduplication system and method based on an inter-network secure IP tunnel.
Background
With the continuous development of Internet technology and the steady growth in the number of Internet users, the global volume of data has increased greatly. Managing data storage economically and efficiently has therefore become one of the most challenging tasks for mass storage systems in the big data era. At the same time, because the storage capacity of a single device is limited, individuals and organizations often rely on cloud storage service providers for low-cost storage, transmission and backup of ever-growing data. To further improve storage efficiency and reduce storage cost, cloud service providers typically adopt data deduplication to avoid storing duplicate data more than once, which reduces storage overhead and also reduces the upload bandwidth consumed by users.
Meanwhile, because real network environments contain many malicious attackers, plaintext data usually has to be encrypted before transmission to avoid disclosing users' private data. However, when different users upload the same data, each encrypts it with a different private key, so the same plaintext becomes different ciphertexts and the data storage server cannot recognize the duplication.
In current network research there are two conventional solutions: a method based on convergent encryption (Convergent Encryption) and a method based on SSL or TLS. For a plaintext M, the convergent encryption method uses a hash function to compute the hash value H(M) and encrypts the plaintext with that hash value as the key, yielding E(H(M), M). To ensure that the user can still decrypt the data when downloading the file, the data center must additionally store the hash value H(M) that served as the encryption key. For security reasons, H(M) is itself encrypted with each user's private key Ka, yielding E(Ka, H(M)). The SSL- or TLS-based method encrypts the data with the SSL or TLS protocol during transmission, and the data storage server decrypts it on receipt before performing subsequent operations. With convergent encryption, because the encrypted hash key must be stored in addition to the data, deduplication can in practice only be performed at file level or on relatively large blocks (KB level). With very small blocks, the additional storage overhead can no longer be ignored, so the redundancy inside files cannot be fully exploited and the deduplication effect is poor. With the SSL or TLS method, the data server works on plaintext when it checks for duplicates. Storing users' private data directly in plaintext would be a serious security risk, so the data must still be encrypted before each store and decrypted on each download, which increases the CPU overhead of the server.
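The convergent encryption scheme described above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function H and, because the Python standard library has no AES, uses a toy XOR keystream as a stand-in for the cipher E; the function names are hypothetical and the toy cipher is not secure.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Stand-in for AES, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher E(key, data); applying it twice restores data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def convergent_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Return (H(M), E(H(M), M)): the key is derived from the content itself."""
    content_key = hashlib.sha256(plaintext).digest()
    return content_key, xor_cipher(content_key, plaintext)

def wrap_key(user_key: bytes, content_key: bytes) -> bytes:
    """E(Ka, H(M)): the per-user wrapped key that must be stored as well."""
    return xor_cipher(user_key, content_key)
```

Two users who encrypt the same block obtain the same ciphertext, so the server can deduplicate it, but each user's wrapped key is stored separately. Under the SHA-256 assumption a wrapped key is 32 bytes, which is small next to a file but comparable in size to a very small block; this is the storage overhead the paragraph refers to.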
Therefore, a problem to be solved by those skilled in the art is to provide a system and method that can deduplicate with a smaller block size and offload the server's complex encryption operations to in-network computing, thereby greatly increasing the deduplication efficiency of the data storage system and significantly reducing the CPU overhead of the server.
Disclosure of Invention
Existing deduplication schemes for file data transmitted to a storage server cannot achieve both high deduplication storage efficiency and low CPU overhead on the storage server. To address this defect, the invention provides a data deduplication system and method based on an inter-network secure IP tunnel. By establishing an encrypted inter-network IP tunnel, the invention ensures secure and reliable packet transmission between the local area network and the data center network, which protects users' private data in an untrusted transmission environment containing malicious nodes. At the same time, the plaintext of the user data is available inside the data center network, which facilitates subsequent analysis and processing without introducing additional storage overhead. The encryption and decryption of stored data are offloaded to the edge programmable switches. The invention thus greatly improves deduplication efficiency and improves the CPU utilization of the server, saving overhead and cost for the data center storage servers.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, the invention provides a data deduplication system based on an inter-network secure IP tunnel, comprising the inter-network secure IP tunnel, an edge switch and a centralized controller;
The inter-network secure IP tunnel is established on programmable switches and is located between a local area network and a data center. It encrypts in full each deduplication-related data packet leaving the local area network or the data center and decrypts in full each deduplication-related data packet entering the local area network or the data center;
The edge switch corresponds to a storage server of the data center. It checks whether each data block of an uploaded file is already stored by the storage server, performs AES encryption on plaintext data blocks that are not yet stored, and performs AES decryption on the encrypted data blocks of a downloaded file;
The centralized controller is located in the data center and is responsible for key negotiation with the controller in the local area network. When a new file data block is stored during a file upload, it adds a new entry to the hash-index table of the corresponding storage server; when a stored file data block is no longer used by any other file during a file deletion, it deletes the old entry from the hash-index table of the corresponding storage server.
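The table maintenance performed by the centralized controller can be sketched as follows. This is a hypothetical software model only: the class name, method names and the dictionary layout are assumptions, key negotiation is not modeled, and a real programmable switch would hold the entries in match-action tables rather than in a Python dictionary.

```python
class CentralizedController:
    """Hypothetical model of the controller's hash-index table management."""

    def __init__(self):
        # storage server ID -> {block hash: block index on that server}
        self.hash_index_tables = {}

    def add_entry(self, server_id, block_hash, index):
        """Called when a new data block is stored during a file upload."""
        self.hash_index_tables.setdefault(server_id, {})[block_hash] = index

    def delete_entry(self, server_id, block_hash):
        """Called when no remaining file uses the block after a deletion."""
        self.hash_index_tables.get(server_id, {}).pop(block_hash, None)

    def is_stored(self, server_id, block_hash):
        """Lookup that the edge switch would perform for a query packet."""
        return block_hash in self.hash_index_tables.get(server_id, {})
```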
In a second aspect, the present invention provides a local area network client file uploading method using the data deduplication system according to the first aspect, including the following steps:
Step one, when uploading a file, the local area network client calculates the hash value of each data block of the file and assembles the hash values in order into a plurality of query data packets, each containing the hash values of m data blocks, which are sent to the corresponding storage server through the inter-network secure IP tunnel;
Step two, the query is answered as each query data packet sent by the local area network client passes through the edge switch corresponding to the storage server, and if a data block is already stored, its index information on the storage server is recorded;
Step three, the storage server integrates the received query result data packets and returns a complete query result data packet;
Step four, after receiving the complete query result data packet, the local area network client sends the missing data blocks to the storage server;
Step five, as each missing data block passes through the edge switch, the switch calculates the hash value of its plaintext, encrypts the plaintext with AES, and forwards the resulting ciphertext data block to the storage server;
Step six, the storage server receives the data block encrypted by the edge switch, updates the data block counter it maintains, and sends the stored hash value and index to the centralized controller so that the hash-index table of the corresponding edge switch is updated.
Optionally, in step three, the complete query result data packet contains a bit string of 0s and 1s, in which a 0 at a given position indicates that the corresponding data block is not stored and a 1 indicates that the corresponding data block is already stored on the server.
Optionally, in step four, after receiving the complete query result data packet, the local area network client sends the missing data blocks corresponding to the 0 positions to the storage server.
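The client side of steps one and four can be sketched as follows. The sketch assumes fixed-size blocks and SHA-256 as the hash function, neither of which the text specifies, and the function names are hypothetical; the result bit string is represented as a Python string of "0" and "1" characters.

```python
import hashlib

def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut the file into fixed-size data blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def build_query_packets(blocks: list[bytes], m: int) -> list[list[bytes]]:
    """Step one: hash every block and group the hashes, in order, m per packet."""
    hashes = [hashlib.sha256(block).digest() for block in blocks]
    return [hashes[i:i + m] for i in range(0, len(hashes), m)]

def missing_blocks(blocks: list[bytes], bit_string: str) -> list[tuple[int, bytes]]:
    """Step four: keep only the blocks whose position in the bit string is 0."""
    if len(bit_string) != len(blocks):
        raise ValueError("bit string length must equal the number of blocks")
    return [(i, block) for i, (block, bit) in enumerate(zip(blocks, bit_string))
            if bit == "0"]
```

With a 10-byte file, a block size of 4 and m = 2, the client produces three blocks and two query packets; a result bit string of "101" means that only the second block has to be uploaded.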
In a third aspect, the present invention provides a method for downloading a local area network client file using the data deduplication system according to the first aspect, including the steps of:
Step one, the local area network client sends the ID of the file to be downloaded;
Step two, the storage server sends the encrypted ciphertext to the local area network client through the inter-network secure IP tunnel according to the index list of the corresponding file;
Step three, as the ciphertext passes through the corresponding edge switch, it is decrypted with AES to recover the original plaintext of the data.
In a fourth aspect, the present invention provides a method for deleting a local area network client file using the data deduplication system according to the first aspect, including the following steps:
Step one, the local area network client sends the ID of the file to be deleted;
Step two, the storage server decrements the counter of each data block in the index list of the corresponding file by 1 in turn, and if a counter reaches 0, notifies the centralized controller to update the hash-index table in the edge switch.
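The counter-based deletion can be sketched as a simple reference count. The class and attribute names are hypothetical, and the notification to the centralized controller is modeled as a plain callback, which is an assumption about the interface.

```python
class StorageServer:
    """Hypothetical model of the per-block counters kept by a storage server."""

    def __init__(self, notify_controller):
        self.counters = {}      # block index -> number of files using the block
        self.file_index = {}    # file ID -> index list of the file's blocks
        self.notify_controller = notify_controller

    def add_file(self, file_id, indexes):
        """Record a file's index list and count one more use of each block."""
        self.file_index[file_id] = list(indexes)
        for index in indexes:
            self.counters[index] = self.counters.get(index, 0) + 1

    def delete_file(self, file_id):
        """Decrement each block counter; report blocks that reach 0."""
        for index in self.file_index.pop(file_id):
            self.counters[index] -= 1
            if self.counters[index] == 0:
                del self.counters[index]
                self.notify_controller(index)  # controller removes the table entry
```

A block shared by two files survives the deletion of the first file and is reported to the controller only when the second file is deleted as well.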
The beneficial effects of the invention are as follows. By establishing an inter-network secure IP tunnel and performing deduplication on programmable network switches, the system and method can deduplicate at a finer data block granularity, which greatly improves deduplication efficiency, better identifies the redundancy among files uploaded by multiple users, and reduces the storage overhead and cost of the storage server. Because the secure IP tunnel lets the programmable switches analyze and process plaintext data, no additional stored data is required, unlike the existing convergent encryption method. The invention also offloads the hash-index table lookup and the AES encryption and decryption operations to the edge programmable switch, which saves CPU overhead on the corresponding storage server and improves CPU resource utilization.
Drawings
Fig. 1 is a system framework diagram of a data deduplication system based on an inter-network secure IP tunnel.
Detailed Description
The invention will now be described in further detail with reference to the accompanying drawings.
Example 1
This embodiment provides a data deduplication system based on an inter-network secure IP tunnel; Fig. 1 is a schematic diagram of the overall framework of the system. As shown in the figure, the system consists mainly of two parts, the local area network clients and the data center: the sender side comprises a plurality of clients and a port programmable switch, and the receiver side comprises a core programmable switch, an aggregation programmable switch, edge programmable switches and a plurality of storage servers. The local area network clients upload, download and delete files; a secure IP tunnel is established between the port switch of the local area network and the core switch of the data center; and the edge switches perform the hash-index lookup and the AES operations.
Specifically, the system comprises an inter-network secure IP tunnel established between the local area network and the data center on programmable switches, an edge switch corresponding to each storage server that performs the in-network computation for data deduplication and encryption, and a centralized controller in the data center network.
The inter-network secure IP tunnel module encrypts in full each deduplication-related data packet leaving the local area network or the data center and decrypts in full each deduplication-related data packet entering the local area network or the data center.
The edge switch checks whether each data block of an uploaded file is already stored by the storage server, performs AES encryption on plaintext data blocks that are not yet stored, and performs AES decryption on the encrypted data blocks of a downloaded file.
The centralized controller is responsible for key negotiation with the controller in the local area network. When a file is uploaded and a new file block is stored, a new entry must be added to the hash-index table of the corresponding storage server; when a file is deleted and a stored file block is no longer used by any other file, the old entry must be deleted from the hash-index table of the corresponding storage server.
The system establishes a secure IP tunnel between the endpoint programmable switches of the local area network and the data center. Each endpoint switch decrypts ciphertext packets received from the external network and encrypts plaintext packets sent to the external network, so that packets are in plaintext form inside the local area network and inside the data center. The main deduplication function is offloaded to the edge programmable switches in the data center network. Before sending the actual plaintext data, the local area network client calculates the corresponding hash values and sends query data packets. When a query packet reaches an edge switch of the data center, the switch looks up whether a corresponding entry exists in its hash-index table, and thereby determines whether the data block is stored on the corresponding storage server. Only when a data block is found to be missing does the client in the local area network subsequently upload the corresponding data, which is AES-encrypted by a programmable switch in the data center network. When a client in the local area network downloads a file, the stored ciphertext sent by the storage server is decrypted by an edge programmable switch in the data center network. The system can therefore support a finer deduplication granularity, and can significantly improve deduplication efficiency while introducing an acceptable additional storage cost and reducing server CPU consumption.
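The lookup that the edge switch performs on a query packet can be sketched in software as follows. This is an illustrative model under stated assumptions: the hash-index table is a plain dictionary, SHA-256 is assumed as the hash function, the function names are hypothetical, and the per-packet result is a string of "0" and "1" characters that the storage server later concatenates.

```python
import hashlib

def lookup_query(hash_index_table, query_hashes):
    """Answer one query packet against a switch's hash-index table.

    Returns a partial bit string ("1" = already stored, "0" = missing) and
    the storage index recorded for each block that is already stored.
    """
    bits = []
    found = {}
    for position, block_hash in enumerate(query_hashes):
        if block_hash in hash_index_table:
            bits.append("1")
            found[position] = hash_index_table[block_hash]
        else:
            bits.append("0")
    return "".join(bits), found

def merge_results(partial_bit_strings):
    """Concatenate per-packet results, in packet order, into one bit string."""
    return "".join(partial_bit_strings)
```

Because only hash values travel in the query packets, the plaintext of a block that is already stored never has to cross the tunnel again.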
Example 2
This embodiment provides a method for uploading, downloading and deleting files using the data deduplication system based on the inter-network secure IP tunnel of Example 1, which specifically comprises the following steps.
For file upload operations:
Step one, when uploading a file, the local area network client calculates the hash value of each data block of the file and assembles the hash values in order into a plurality of query data packets, each containing the hash values of m data blocks, which are sent to the corresponding storage server through the secure IP tunnel;
Step two, the query is answered as each query data packet sent by the local area network client passes through the edge switch corresponding to the storage server, and if a data block is already stored, its index information on the storage server is recorded;
Step three, the storage server integrates the received query result data packets and returns a complete data packet containing a bit string of 0s and 1s, in which a 0 at a given position indicates that the corresponding data block is not stored and a 1 indicates that the corresponding data block is already stored on the server;
Step four, after receiving the complete query result data packet, the local area network client sends the missing data blocks corresponding to the 0 positions to the storage server;
Step five, as each missing data block passes through the edge switch, the switch calculates the hash value of its plaintext, encrypts the plaintext with AES, and forwards the encrypted data block to the storage server;
Step six, the storage server receives the data block encrypted by the edge switch, updates the data block counter it maintains, and sends the stored hash value and index to the controller of the data center, so that the hash-index table of the corresponding programmable switch is updated.
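Steps five and six of the upload can be sketched for a single missing block as follows. The sketch makes several assumptions that the text does not state: SHA-256 is the hash function, the block hash computed on the switch travels to the storage server together with the ciphertext, blocks are stored in an append-only list, and `encrypt` is a caller-supplied callable standing in for the switch's AES engine. All names are hypothetical.

```python
import hashlib

def edge_switch_process(plaintext_block, encrypt):
    """Step five: hash the plaintext block, then encrypt it.

    `encrypt` stands in for AES on the switch; no real AES is implemented here.
    """
    block_hash = hashlib.sha256(plaintext_block).digest()
    return block_hash, encrypt(plaintext_block)

def storage_server_store(block_hash, ciphertext, stored_blocks, counters,
                         controller_updates):
    """Step six: store the ciphertext, set its counter, queue a table update."""
    index = len(stored_blocks)                       # next free slot
    stored_blocks.append(ciphertext)
    counters[index] = 1                              # first file using the block
    controller_updates.append((block_hash, index))   # sent to the controller
    return index
```

Once the controller has applied the queued (hash, index) pair to the switch's hash-index table, later queries for the same block are answered with a 1 and the block is not uploaded again.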
For file download operations:
Step one, the local area network client sends the ID of the file to be downloaded;
Step two, the storage server sends the encrypted ciphertext to the local area network client through the secure IP tunnel according to the index list of the corresponding file;
Step three, as the ciphertext passes through the corresponding edge switch, it is decrypted with AES to recover the original plaintext of the data.
For file delete operations:
Step one, the local area network client sends the ID of the file to be deleted;
Step two, the storage server decrements the counter of each data block in the index list of the corresponding file by 1 in turn, and if a counter reaches 0, notifies the controller to update the hash-index table in the edge switch.
The innovation of the system and method is that, for the first time, an inter-network secure IP tunnel is implemented on programmable network switches, and the hash-index table lookup for data deduplication, the AES encryption of plaintext and the AES decryption of ciphertext are offloaded to programmable switches inside the data center. This ensures encrypted transmission of data through an insecure external network environment, and allows plaintext analysis and processing inside the local area network and the data center to use a smaller data block size more efficiently, without introducing additional storage overhead. The invention also offloads computing tasks of the storage server to the corresponding edge switch and uses the line-rate processing speed of the programmable switch to reduce the CPU overhead of the storage server.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the invention without departing from the principles thereof are intended to be within the scope of the invention as set forth in the following claims.

Claims (6)

Priority Applications (1)

Application number: CN202411733168.3A
Priority date: 2024-11-28
Filing date: 2024-11-28
Title: Data deduplication system and method based on secure IP tunnel between networks
Status: Pending

Publications (1)

Publication number: CN119628904A (en)
Publication date: 2025-03-14

Family

ID: 94897782

Country Status (1)

Country: CN (1)
Link: CN119628904A (en)


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
