CN107480267A - A method for improving file differential synchronization speed using locality - Google Patents

A method for improving file differential synchronization speed using locality

Info

Publication number
CN107480267A
Authority
CN
China
Prior art keywords
file
computer
fileinfo
files
blocks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710708408.8A
Other languages
Chinese (zh)
Inventor
李洋
李振华
郭振格
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUXI QINGHUA INFORMATION SCIENCE AND TECHNOLOGY NATIONAL LABORATORY INTERNET OF THINGS TECHNOLOGY CENTER
Original Assignee
WUXI QINGHUA INFORMATION SCIENCE AND TECHNOLOGY NATIONAL LABORATORY INTERNET OF THINGS TECHNOLOGY CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-08-17
Filing date
2017-08-17
Publication date
2017-12-15
Application filed by WUXI QINGHUA INFORMATION SCIENCE AND TECHNOLOGY NATIONAL LABORATORY INTERNET OF THINGS TECHNOLOGY CENTER
Priority to CN201710708408.8A
Publication of CN107480267A
Legal status: Pending

Abstract

The present invention discloses a method for improving file differential synchronization speed using locality. In this method, computer b splits its stored local file B into blocks, computes a checksum for each file block, and sends the file information of local file B together with the resulting checksum table to computer a. Computer a stores the file A to be synchronized, which is similar to B. When computer a compares the checksums layer by layer, from the outer layer to the inner layer, it uses file locality, that is, the observation that a file block adjacent to a matched file block is likely to match as well, to accelerate the matching process. It then determines the file information for which matching failed and sends the corresponding file blocks to computer b, and computer b splices the received file blocks with file B to complete file synchronization. When file locality is good, the invention saves a large amount of computation time, and it is suited to synchronizing files whose differences are concentrated.

Description

A method for improving file differential synchronization speed using locality
Technical field
The present invention relates to the field of Internet technology, and in particular to a method for improving file differential synchronization speed using locality.
Background
Cloud storage has become a trend in the future development of storage. At present, cloud storage services can provide Internet users with large-capacity storage and can also serve as back-end data storage for Internet applications. Although cloud storage services are convenient and practical, some problems still affect the experience of Internet users. One common problem is that uploading a file to a cloud service for synchronization takes a long time, which seriously degrades the user experience. In examining this problem, it was found that when users synchronize files through a cloud storage service, the differences between the uploaded file to be synchronized and the cloud copy show good locality in the vast majority of cases. The PC clients of well-known cloud services at home and abroad, such as Baidu Cloud and Dropbox, all use differential-encoding file synchronization to reduce traffic consumption, but this approach synchronizes files slowly, and the speed of the comparison itself has not been optimized. Dropbox is a free network file synchronization tool: an online storage service operated by the Dropbox company that synchronizes files over the Internet through cloud computing and lets users store and share files and folders.
Summary of the invention
The object of the present invention is to solve the problems mentioned in the background section above by means of a method for improving file differential synchronization speed using locality.
To this end, the present invention adopts the following technical solution:
A method for improving file differential synchronization speed using locality, comprising the following steps:
S101: computer b reads the original file and splits the stored local file B into blocks, obtaining the file blocks of the local file B;
S102: the computer b computes a checksum for each file block of the local file B and generates a checksum table, then sends the file information of the local file B and the checksum table to computer a;
S103: according to the file information and checksum table of the local file B, the computer a matches against the file A to be synchronized based on the locality of adjacent file blocks, and feeds the resulting unmatched file information back to the computer b;
S104: the computer b receives the unmatched file information and splices the file blocks corresponding to the unmatched file information with the local file B to complete synchronization.
In particular, in step S102, the computer b computing a checksum for each file block of the local file B and generating a checksum table specifically comprises:
The computer b computes a 32-bit rolling hash value and a 128-bit hash value for each file block of the local file B and generates the checksum table.
In particular, step S102 specifically comprises:
The computer b computes a 32-bit rolling hash value and a 128-bit hash value for each file block of the local file B, stores them in a 16-bit checksum hash table and a checksum table, and sends the file information of the local file B and the checksum table to computer a.
In particular, the file information in step S102 includes, but is not limited to, the file size, the file block size, and the file block numbers.
In particular, step S103 comprises: according to the file information and checksum table of the local file B, the computer a matches the file A to be synchronized by traversing it and comparing checksums, performing the matching based on the locality of adjacent file blocks, and feeds the resulting unmatched file information back to the computer b.
In particular, step S103 specifically comprises: the computer a processes the file A to be synchronized according to the file block size and, starting from the part corresponding to the file block with the smallest block number, selects each current check portion in turn, moving toward the end of the file. It computes the 32-bit hash value of the current check portion's checksum and looks up that 32-bit hash value in the checksum table. If this lookup succeeds, it computes the 128-bit hash value of the current check portion's checksum and looks up that 128-bit hash value in the checksum table. If both lookups succeed, the next check portion adjacent to the current one does not need a 32-bit hash value: its 128-bit hash value is computed directly and looked up in the checksum table, which accelerates matching. If the current check portion fails to match, the file information of the file block corresponding to the current check portion is added to the unmatched file information. Once the whole of the file A to be synchronized has been searched, the resulting unmatched file information is fed back to the computer b.
In particular, in step S103, the provision that, if both lookups succeed, the next check portion adjacent to the current one does not need a 32-bit hash value and its 128-bit hash value is computed directly and looked up in the checksum table, specifically comprises:
If both lookups succeed, the next check portion adjacent to the current one does not need a 32-bit hash value; its 128-bit hash value is computed directly and looked up in the checksum table, and the file block number and file block position information of the file block corresponding to the current check portion are added to the matched file information.
In particular, step S104 specifically comprises: the computer b receives the unmatched file information and, according to the matched file information, splices the file blocks corresponding to the unmatched file information with the local file B to complete synchronization.
In the method proposed by the present invention for improving file differential synchronization speed using locality, computer b splits its stored local file B into blocks, computes a checksum for each file block, and sends the file information of local file B and the resulting checksum table to computer a. Computer a stores the file A to be synchronized, which is similar to B. When computer a compares the checksums layer by layer, from the outer layer to the inner layer, it uses file locality, that is, the observation that a file block adjacent to a matched file block is likely to match as well, to accelerate the matching process. It then determines the unmatched file information and sends the corresponding file blocks to computer b, and computer b splices the received file blocks with file B to complete file synchronization. When file locality is good, the invention saves a large amount of computation time, and it is suited to synchronizing files whose differences are concentrated.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention for improving file differential synchronization speed using locality;
Fig. 2 is an algorithm flow chart of the method of the present invention for improving file differential synchronization speed using locality.
Detailed description of the embodiments
To facilitate understanding, the present invention is described more fully below with reference to the accompanying drawings, which show preferred embodiments of the invention. The invention can, however, be implemented in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and completely. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the invention. The terms used in this description are intended only to describe specific embodiments and are not intended to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Referring to Fig. 1, Fig. 1 is a flow chart of the method provided by the present invention for improving file differential synchronization speed using locality.
In this embodiment, the method for improving file differential synchronization speed using locality specifically comprises the following steps:
S101: computer b reads the original file and splits the stored local file B into blocks, obtaining the file blocks of the local file B. Note that what computer b stores is the old file from before synchronization, namely the local file B, whereas what computer a stores is the file A to be synchronized. To observe and exploit locality, the locality characteristics of the file must be made visible, which is why the file needs to be split into blocks.
S102: the computer b computes a checksum for each file block of the local file B and generates a checksum table, then sends the file information of the local file B and the checksum table to computer a.
Specifically, the computer b computes a 32-bit rolling hash value and a 128-bit hash value for each file block of the local file B, stores them in a 16-bit checksum hash table and a checksum table, and sends the file information of the local file B and the checksum table to computer a. In this embodiment the file information includes, but is not limited to, the file size, the file block size, and the file block numbers. In this embodiment the 32-bit weak checksum, i.e. the 32-bit rolling hash value, is computed with the Adler32 algorithm, and the 128-bit strong checksum, i.e. the 128-bit hash value, is computed with the MD5 algorithm.
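Steps S101 and S102 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the function name build_checksum_table, the dictionary layout, and the rule that the 16-bit hash table key is the low 16 bits of the weak checksum are all hypothetical, since the text does not say how the 16-bit table is indexed. zlib.adler32 and hashlib.md5 are the standard-library Adler-32 and MD5 routines.

```python
import hashlib
import zlib


def build_checksum_table(data: bytes, block_size: int):
    """Split file B into fixed-size blocks and checksum each block.

    Returns (file_info, table). The table maps a 16-bit key to a list of
    (weak32, strong128, block_number) entries.
    """
    file_info = {
        "file_size": len(data),
        "block_size": block_size,
        "block_count": (len(data) + block_size - 1) // block_size,
    }
    table = {}
    for number, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]   # last block may be shorter
        weak = zlib.adler32(block)                # 32-bit weak checksum
        strong = hashlib.md5(block).digest()      # 128-bit strong checksum
        key = weak & 0xFFFF                       # assumed 16-bit table key
        table.setdefault(key, []).append((weak, strong, number))
    return file_info, table
```

Computer b would send file_info and table to computer a. The 16-bit key only narrows the search to a small bucket; the full 32-bit and 128-bit values are still compared afterwards.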
S103: according to the file information and checksum table of the local file B, the computer a matches against the file A to be synchronized based on the locality of adjacent file blocks, and feeds the resulting unmatched file information back to the computer b.
Specifically, according to the file information and checksum table of the local file B, the computer a matches the file A to be synchronized by traversing it and comparing checksums, performing the matching based on the locality of adjacent file blocks, and feeds the resulting unmatched file information back to the computer b:
The computer a processes the file A to be synchronized according to the file block size and, starting from the part corresponding to the file block with the smallest block number, selects each current check portion in turn, moving toward the end of the file. It computes the 32-bit hash value of the current check portion's checksum and looks up that 32-bit hash value in the checksum table. If this lookup succeeds, it computes the 128-bit hash value of the current check portion's checksum and looks up that 128-bit hash value in the checksum table. If both lookups succeed, the next check portion adjacent to the current one does not need a 32-bit hash value: its 128-bit hash value is computed directly and looked up in the checksum table, and the file block number and file block position information of the file block corresponding to the current check portion are added to the matched file information, which accelerates the matching process. If the current check portion fails to match, the file information of the file block corresponding to the current check portion is added to the unmatched file information. Once the whole of the file A to be synchronized has been searched, the resulting unmatched file information is fed back to the computer b.
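The matching loop of step S103 can be sketched as follows. This is a simplified illustration under several assumptions that the text does not settle: the names build_tables and match_with_locality are hypothetical, a failed window advances by one byte (in the style of rsync), the weak checksum is recomputed with zlib.adler32 for every window instead of being updated incrementally as a true rolling hash would be, and the lookup structures are plain Python sets and dictionaries rather than the 16-bit hash table.

```python
import hashlib
import zlib


def build_tables(b_data: bytes, block_size: int):
    """Checksum every block of file B (the role of computer b)."""
    weak_set = set()    # 32-bit weak checksums present in B
    strong_index = {}   # 128-bit strong hash -> block number in B
    for number, start in enumerate(range(0, len(b_data), block_size)):
        block = b_data[start:start + block_size]
        weak_set.add(zlib.adler32(block))
        strong_index.setdefault(hashlib.md5(block).digest(), number)
    return weak_set, strong_index


def match_with_locality(a_data: bytes, block_size: int, weak_set, strong_index):
    """Scan file A (the role of computer a), skipping the weak check
    for a window that directly follows a matched window."""
    matched = []     # (block number in B, offset in A)
    unmatched = []   # (offset in A, literal bytes that must be sent)
    stats = {"weak": 0, "strong": 0}
    pos, literal_start, prev_matched = 0, 0, False
    while pos < len(a_data):
        window = a_data[pos:pos + block_size]
        if prev_matched:
            candidate = True              # locality: go straight to MD5
        else:
            stats["weak"] += 1
            candidate = zlib.adler32(window) in weak_set
        number = None
        if candidate:
            stats["strong"] += 1
            number = strong_index.get(hashlib.md5(window).digest())
        if number is not None:
            if literal_start < pos:       # flush bytes skipped before this match
                unmatched.append((literal_start, a_data[literal_start:pos]))
            matched.append((number, pos))
            pos += len(window)
            literal_start = pos
            prev_matched = True
        else:
            pos += 1                      # assumed: slide by one byte on failure
            prev_matched = False
    if literal_start < len(a_data):
        unmatched.append((literal_start, a_data[literal_start:]))
    return matched, unmatched, stats
```

Because the strong hash is looked up across the whole table, a failed shortcut lookup already proves that the window matches no block of B, so no fallback to the weak check is needed. The stats counters make the saving visible: every window that follows a match costs one MD5 computation and no Adler-32 computation.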
S104: the computer b receives the unmatched file information and splices the file blocks corresponding to the unmatched file information with the local file B to complete synchronization.
Specifically, the computer b receives the unmatched file information and, according to the matched file information, splices the file blocks corresponding to the unmatched file information with the local file B to complete synchronization.
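The splicing in step S104 can be sketched as follows. The message layout is an assumption made for illustration: matched is taken to be a list of (block number in B, offset in A) pairs and unmatched a list of (offset in A, bytes) pairs, and the function name splice is hypothetical. The text only states that matched entries carry a block number and position information and that the unmatched blocks themselves are sent to computer b.

```python
def splice(b_data: bytes, block_size: int, matched, unmatched) -> bytes:
    """Rebuild file A on computer b from local blocks of B plus received data."""
    # Blocks that matched are copied from the local file B.
    pieces = [
        (offset, b_data[number * block_size:(number + 1) * block_size])
        for number, offset in matched
    ]
    # Blocks that did not match arrive from computer a as literal bytes.
    pieces.extend(unmatched)
    pieces.sort(key=lambda piece: piece[0])

    out = bytearray()
    for offset, chunk in pieces:
        if offset != len(out):            # pieces must tile the file exactly
            raise ValueError("gap or overlap at offset %d" % offset)
        out += chunk
    return bytes(out)
```

The offset check is a cheap consistency guard: if the two lists do not cover the target file exactly once, the function fails loudly instead of producing a corrupted file.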
As shown in Fig. 2, based on the above method for improving file differential synchronization speed using locality, this embodiment gives a specific algorithm flow for the method, comprising the following steps:
S201: computer b reads the original file and splits the local file B into blocks to obtain the file blocks of local file B;
S202: the computer b computes a checksum for each file block of the local file B and sends the file information and checksum table of the local file to computer a;
S203: the computer a processes the file A to be synchronized according to the file block size and, starting from the part corresponding to the file block with the smallest block number, selects each current check portion in turn, moving toward the end of the file; it computes the 32-bit hash value of the current check portion's checksum and looks up that 32-bit hash value in the checksum hash table;
S204: the 128-bit hash value of the current check portion's checksum is computed and looked up in the checksum hash table. Step S204 is carried out only if the lookup in step S203 matched.
S205: the next check portion adjacent to the current one does not need a 32-bit hash value; its 128-bit hash value is computed directly and looked up in the checksum hash table, which accelerates matching. Step S205 is carried out only if the lookup in step S204 matched.
S206: the file information of the file block corresponding to the current check portion is added to the unmatched file information. Here the current check portion is one that failed to match, and this step is carried out when the lookup in S203 or S204 failed.
S207: the computer b receives the unmatched file information and splices the file blocks corresponding to the unmatched file information with the local file B to complete synchronization.
In the technical solution of the present invention, computer b splits its stored local file B into blocks, computes a checksum for each file block, and sends the file information of local file B and the resulting checksum table to computer a. Computer a stores the file A to be synchronized, which is similar to B. When computer a compares the checksums layer by layer, from the outer layer to the inner layer, it uses file locality, that is, the observation that a file block adjacent to a matched file block is likely to match as well, to accelerate the matching process. It then determines the unmatched file information and sends the corresponding file blocks to computer b, and computer b splices the received file blocks with file B to complete file synchronization. When file locality is good, the invention saves a large amount of computation time, and it is suited to synchronizing files whose differences are concentrated.
Those of ordinary skill in the art will understand that all or part of the flow of the above embodiment can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the flow of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the invention. Therefore, although the invention has been described in some detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the inventive concept; the scope of the invention is determined by the scope of the appended claims.

Claims (8)

  6. The method for improving file differential synchronization speed using locality according to claim 5, characterized in that step S103 specifically comprises: the computer a processes the file A to be synchronized according to the file block size and, starting from the part corresponding to the file block with the smallest block number, selects each current check portion in turn, moving toward the end of the file; it computes the 32-bit hash value of the current check portion's checksum and looks up that 32-bit hash value in the checksum table; if this lookup succeeds, it computes the 128-bit hash value of the current check portion's checksum and looks up that 128-bit hash value in the checksum table; if both lookups succeed, the next check portion adjacent to the current one does not need a 32-bit hash value, and its 128-bit hash value is computed directly and looked up in the checksum table, which accelerates matching; if the current check portion fails to match, the file information of the file block corresponding to the current check portion is added to the unmatched file information; once the whole of the file A to be synchronized has been searched, the resulting unmatched file information is fed back to computer b.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201710708408.8A | 2017-08-17 | 2017-08-17 | A method for improving file differential synchronization speed using locality

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201710708408.8A | 2017-08-17 | 2017-08-17 | A method for improving file differential synchronization speed using locality

Publications (1)

Publication Number | Publication Date
CN107480267A (en) | 2017-12-15

Family

ID=60600966

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status | Publication
CN201710708408.8A | A method for improving file differential synchronization speed using locality | 2017-08-17 | 2017-08-17 | Pending | CN107480267A (en)

Country Status (1)

Country | Link
CN (1) | CN107480267A (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101908077A (en) * | 2010-08-27 | 2010-12-08 | 华中科技大学 | A data deduplication method suitable for cloud backup
CN102065098A (en) * | 2010-12-31 | 2011-05-18 | 网宿科技股份有限公司 | Method and system for synchronizing data among network nodes
CN102831222A (en) * | 2012-08-24 | 2012-12-19 | 华中科技大学 | Differential compression method based on data de-duplication
CN104243508A (en) * | 2013-06-07 | 2014-12-24 | 富鸿康科技(深圳)有限公司 | Server, client side and file synchronization method
CN103685509A (en) * | 2013-12-12 | 2014-03-26 | 深圳市彩讯科技有限公司 | Method for synchronizing file delta
CN104023085A (en) * | 2014-06-25 | 2014-09-03 | 武汉大学 | Security cloud storage system based on increment synchronization
US20160267112A1 (en) * | 2015-03-09 | 2016-09-15 | International Business Machines Corporation | File transfer system using file backup times
CN105554081A (en) * | 2015-12-09 | 2016-05-04 | 华为技术有限公司 | File difference transmission method and device
CN105872017A (en) * | 2016-03-18 | 2016-08-17 | 清华大学 | Method and apparatus for carrying out file differential encoding synchronization at web page side

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111787074A (en) * | 2020-06-18 | 2020-10-16 | 杭州美创科技有限公司 | File synchronization method and terminal
CN111787074B (en) * | 2020-06-18 | 2023-04-21 | 杭州美创科技股份有限公司 | File synchronization method and terminal
CN114385747A (en) * | 2021-09-22 | 2022-04-22 | 国家电网有限公司 | Mobile Internet Fast Data Synchronization Method

Similar Documents

Publication | Title
US12361955B2 (en) | Audio fingerprinting
CN105530284B (en) | File synchronization method
US20090150394A1 (en) | Document Merge
CN104866985B (en) | The recognition methods of express delivery odd numbers, apparatus and system
US20110004608A1 (en) | Combining and re-ranking search results from multiple sources
CN107609106B (en) | Similar article searching method, device, equipment and storage medium
US10460203B2 (en) | Jaccard similarity estimation of weighted samples: scaling and randomized rounding sample selection with circular smearing
US8370390B1 (en) | Method and apparatus for identifying near-duplicate documents
US9449116B2 (en) | Online radix tree compression with key sequence skip
KR20140131333A (en) | Stream recognition and filtering
JP2011170667A (en) | File-synchronizing system, file synchronization method, and file synchronization program
CN107480267A (en) | A method for improving file differential synchronization speed using locality
JP2012164130A (en) | Data division program
CN104809256A (en) | Data deduplication method and data deduplication method
Paganelli et al. | Parallelizing computations of full disjunctions
AU2018265614A1 (en) | Data storage method and apparatus
CN104765831B (en) | A kind of generation of dictionary sheet and its application process and device
US9654472B2 (en) | Storage count verification system
US9189488B2 (en) | Determination of landmarks
CN104765828B (en) | A kind of generation of dictionary data table and application process and device
CN111211966A (en) | Method and system for storing transmission files in chat tool
CN104765829B (en) | A kind of information retrieval method and device
CN113590653B (en) | A relationship retrieval method, system, storage medium and terminal device
Zhou et al. | A bit string content aware chunking strategy for reduced CPU energy on cloud storage
Butakov et al. | Low RAM footprint algorithm for small scale plagiarism detection projects

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication

Application publication date: 2017-12-15
