CN102541985A - Organization method of client directory cache in distributed file system - Google Patents

Organization method of client directory cache in distributed file system

Info

Publication number
CN102541985A
Authority
CN
China
Prior art keywords
client
catalogue
read
directory
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011103264489A
Other languages
Chinese (zh)
Inventor
杨浩
常涛
吕明强
邵宗有
刘新春
苗艳超
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN2011103264489A
Publication of CN102541985A
Legal status: Pending

Abstract

The invention discloses a method for organizing the client directory cache in a distributed file system. The distributed file system adopts a multi-metadata-server architecture, i.e. the contents of a single directory are distributed across multiple metadata servers, mainly to spread the load of metadata access and improve concurrency. Because network applications write rarely and read often, the method keeps the contents of directory entries and the corresponding inodes in the client's cache, so that the client does not have to communicate with the servers repeatedly on repeated reads. In addition, when a directory is accessed for the first time, the directory entries of that directory distributed on different metadata servers are read ahead, and file inodes and file contents are read ahead according to a default read-ahead policy or a read-ahead policy issued by the application. As a result, when the application needs to access a file under the directory, the file's metadata and data may already have been read into the client's local cache, which greatly speeds up the application.

Description

Organization method of client directory cache in a distributed file system
Technical Field
The present invention relates to directory entry management in distributed file systems, and specifically to a method for organizing the client directory cache in a distributed file system.
Background
With the rapid development of computer technology, applications demand ever more storage, and network applications are the typical case. Their storage demands fall roughly into two kinds. The first is dominated by large files, as in audio and video web applications: there are few files, but a single file is usually gigabytes or even terabytes in size. The second is dominated by small files, as in online shopping malls and portal websites: each file is small, but the number of files is huge. Tens of millions of files are commonly stored under a single directory, and such files are usually written once and mostly read afterwards.
To meet these storage demands, most network applications have introduced distributed file systems, with NFS, Lustre, and GPFS being representative. Such file systems perform reasonably well on large files, but when a single directory holds an enormous number of small files, the efficiency of their directory entry operations is rarely satisfactory. Many Internet companies, such as Taobao, NetEase, and Tencent, have therefore designed their own small-file storage architectures to suit their own needs.
Among the parallel file systems seen so far that are optimized for small-file storage, the great majority adopt a single-group metadata architecture, and the client usually reads from the metadata server only at the moment it needs to access metadata. Network latency therefore has a large impact on client response time, and if the data the client needs is not in the metadata server's memory, the disk must be accessed as well. A large part of the application's access time is thus spent on I/O, which hurts the application's real-time behavior.
Summary of the Invention
The present invention aims to disclose a method for organizing the client directory entry cache in a distributed file system, which can effectively solve the problem of low access efficiency for massive numbers of small files under a single directory in network applications.
A method for organizing the client directory cache in a distributed file system:
Directory subsets are divided as required; the directory entries in a single directory are hashed and stored into the directory subsets; each directory subset is distributed to a metadata server; and the directory entry cache structure on the client is organized by directory subset.
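The partitioning in the claim above can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the source says only that entries are hashed into subsets and that each subset lives on a metadata server. The hash function (MD5 via `hashlib`), the one-subset-per-server mapping, and the names `subset_of` and `partition_directory` are hypothetical.

```python
import hashlib
from collections import defaultdict


def subset_of(name: str, num_subsets: int) -> int:
    """Map a directory entry name to a subset index (assumed: MD5 mod N)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_subsets


def partition_directory(names, num_subsets):
    """Group entry names by subset; subset i is assumed to live on server i."""
    subsets = defaultdict(list)
    for name in names:
        subsets[subset_of(name, num_subsets)].append(name)
    return dict(subsets)


if __name__ == "__main__":
    entries = [f"item_{i}.jpg" for i in range(10)]
    for idx, members in sorted(partition_directory(entries, 4).items()):
        print(f"subset {idx} -> metadata server {idx}: {members}")
```

Hashing on the entry name means the client can compute which subset (and so which server) holds a given file without asking any server first.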
Preferably, when an application needs to traverse the directory, the client first checks whether it exists in the local cache. If it does, the result is returned directly to the application; if it does not, the client reads from the metadata servers and, once the read completes, stores the result in the local cache and then returns it to the application.
Preferably, the read is performed in parallel.
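The cache-then-parallel-read flow above might look like the sketch below. Everything here is an assumption made for illustration: `FakeMetadataServer` is an in-memory stand-in for a real metadata server, `DirCacheClient` and its methods are hypothetical names, and a thread pool is just one possible way to issue the reads in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeMetadataServer:
    """In-memory stand-in for one metadata server holding one subset."""

    def __init__(self, entries_by_dir):
        self.entries_by_dir = entries_by_dir
        self.reads = 0  # counts requests, to show the effect of caching

    def read_entries(self, directory):
        self.reads += 1
        return list(self.entries_by_dir.get(directory, []))


class DirCacheClient:
    def __init__(self, servers):
        self.servers = servers
        # directory -> one entry list per subset, in server order
        self.cache = {}

    def list_dir(self, directory):
        if directory not in self.cache:
            # Cache miss: read every subset from its server in parallel.
            with ThreadPoolExecutor(max_workers=len(self.servers)) as pool:
                self.cache[directory] = list(
                    pool.map(lambda s: s.read_entries(directory), self.servers)
                )
        # Cache hit (or freshly filled): answer from the local cache.
        return [name for subset in self.cache[directory] for name in subset]


if __name__ == "__main__":
    servers = [
        FakeMetadataServer({"/shop": ["a.jpg", "c.jpg"]}),
        FakeMetadataServer({"/shop": ["b.jpg"]}),
    ]
    client = DirCacheClient(servers)
    print(client.list_dir("/shop"))
    print(client.list_dir("/shop"))  # served locally, no new server reads
    print([s.reads for s in servers])
```

The cache keeps one list per subset rather than a single merged list, matching the requirement that the client-side structure be organized by directory subset.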
Preferably, after reading a directory for the first time, the client prefetches the files under that directory.
Preferably, the prefetch policy is: for every directory entry under the directory, read the corresponding inode from the corresponding metadata server.
Preferably, the prefetch command may be issued by the application; on receiving a prefetch request, the client reads the inodes of that batch of files back from the metadata servers and then prefetches the data from the data servers.
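The two prefetch policies above can be pictured with the following sketch. It is an assumption-laden illustration rather than the patent's code: `PrefetchingClient`, `fetch_inode`, and `fetch_data` are hypothetical names, and the two callables stand in for requests to a metadata server and a data server respectively.

```python
class PrefetchingClient:
    def __init__(self, fetch_inode, fetch_data):
        # fetch_inode(name) -> inode; fetch_data(inode) -> bytes (both assumed)
        self.fetch_inode = fetch_inode
        self.fetch_data = fetch_data
        self.inode_cache = {}
        self.data_cache = {}

    def prefetch_default(self, entry_names):
        """Default policy: read the inode of every entry in the directory."""
        for name in entry_names:
            if name not in self.inode_cache:
                self.inode_cache[name] = self.fetch_inode(name)

    def prefetch_batch(self, batch):
        """Application-issued policy: inodes first, then the file data."""
        self.prefetch_default(batch)
        for name in batch:
            if name not in self.data_cache:
                self.data_cache[name] = self.fetch_data(self.inode_cache[name])

    def read_file(self, name):
        """Serve from cache when prefetched, otherwise go to the servers."""
        if name not in self.inode_cache:
            self.inode_cache[name] = self.fetch_inode(name)
        if name not in self.data_cache:
            self.data_cache[name] = self.fetch_data(self.inode_cache[name])
        return self.data_cache[name]


if __name__ == "__main__":
    files = {"a.txt": b"alpha", "b.txt": b"beta"}
    client = PrefetchingClient(
        fetch_inode=lambda name: {"name": name, "size": len(files[name])},
        fetch_data=lambda inode: files[inode["name"]],
    )
    client.prefetch_default(files)       # after the first directory read
    client.prefetch_batch(["a.txt"])     # application asks for one batch
    print(client.read_file("a.txt"))     # already in the local cache
```

In a real client the prefetch calls would presumably run in the background while the application does its own work; they are synchronous here only to keep the sketch short.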
In the present invention, the distributed file system adopts a multi-metadata-server architecture, i.e. the contents of a single directory are distributed across multiple metadata servers. This architecture is chosen mainly to spread the load of metadata access and improve concurrency. Because network applications write rarely and read often, the invention keeps the contents of directory entries and the corresponding inodes in the client's cache, so that the client does not have to communicate with the servers repeatedly on repeated reads. In addition, when a directory is accessed for the first time, the directory entries of that directory, which are distributed on different metadata servers, are read ahead in parallel, and file inodes and file contents are read ahead according to the default read-ahead policy or a read-ahead policy issued by the application. As a result, when the application needs to access a file under the directory, the file's metadata and data may already have been read into the client's local cache, which greatly speeds up the application.
Embodiments
The invention is described in detail below with reference to an embodiment:
(1) In the present invention, the directory entries in a single directory are first hashed by name and divided into several subsets, and each subset is distributed to a metadata server.
(2) The directory entry cache structure on the client is organized by directory entry subset, i.e. the directory entries distributed on each metadata server are managed separately and kept independent of one another.
(3) When an application needs to traverse a directory, the client first checks whether it exists in the local cache; if it does, the result is returned directly to the user. If it is not cached, the client must read from the metadata servers. Because all the directory entries of a single directory are stored on different metadata servers by subset, the invention reads them in parallel, which speeds up directory entry reads. After the directory entries have been read, the client first stores them in the local cache and then returns them to the application.
(4) The ultimate purpose of an application accessing a directory is usually to access the files under it, so after the directory entries are traversed, requests to access each file under the directory are issued in turn to the file system client. To make full use of the time the application spends on its own business processing, the file system client in this invention prefetches the files under the directory after reading it for the first time. The default prefetch policy is to read, for every directory entry under the directory, the corresponding inode information from the corresponding metadata server. The application may also issue a read-ahead policy to the client according to its own characteristics, for example when it needs a certain batch of files read ahead; on receiving the read-ahead request, the client reads the inodes of that batch of files back from the metadata servers and then prefetches the data from the data servers. In this way, when the application needs to access a specific file, the data it needs may already have entered the client's local cache through read-ahead, which can significantly reduce the application's response time.

Claims (6)

CN2011103264489A | 2011-10-25 | 2011-10-25 | Organization method of client directory cache in distributed file system | Pending | CN102541985A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2011103264489A / CN102541985A (en) | 2011-10-25 | 2011-10-25 | Organization method of client directory cache in distributed file system

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2011103264489A / CN102541985A (en) | 2011-10-25 | 2011-10-25 | Organization method of client directory cache in distributed file system

Publications (1)

Publication Number | Publication Date
CN102541985A (en) | 2012-07-04

Family

ID=46348888

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN2011103264489A (Pending, CN102541985A (en)) | Organization method of client directory cache in distributed file system | 2011-10-25 | 2011-10-25

Country Status (1)

Country | Link
CN (1) | CN102541985A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN1614591A (en)* | 2004-12-02 | 2005-05-11 | 中国科学院计算技术研究所 | Method for organizing and accessing distributive catalogue of document system
CN1692356A (en)* | 2002-11-14 | 2005-11-02 | 易斯龙系统公司 | Systems and methods for restriping files in a distributed file system
CN102024017A (en)* | 2010-11-04 | 2011-04-20 | 天津曙光计算机产业有限公司 | Method for traversing directory entries of distribution type file system in repetition-free and omission-free way
CN102024019A (en)* | 2010-11-04 | 2011-04-20 | 曙光信息产业(北京)有限公司 | Suffix tree based catalog organizing method in distributed file system
CN102024016A (en)* | 2010-11-04 | 2011-04-20 | 天津曙光计算机产业有限公司 | Rapid data restoration method for distributed file system (DFS)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102819599A (en)* | 2012-08-15 | 2012-12-12 | 华数传媒网络有限公司 | Method for constructing hierarchical catalogue based on consistent hashing data distribution
CN102819599B (en)* | 2012-08-15 | 2016-06-01 | 华数传媒网络有限公司 | The method building hierarchical directory in uncommon data distributed basis is breathed out in consistence
CN103150394B (en)* | 2013-03-25 | 2014-07-23 | 中国人民解放军国防科学技术大学 | Distributed file system metadata management method facing to high-performance calculation
CN103150394A (en)* | 2013-03-25 | 2013-06-12 | 中国人民解放军国防科学技术大学 | Distributed file system metadata management method facing to high-performance calculation
CN104125253A (en)* | 2013-04-27 | 2014-10-29 | 博雅网络游戏开发(深圳)有限公司 | Network application realization method and system
CN104125253B (en)* | 2013-04-27 | 2017-10-24 | 博雅网络游戏开发(深圳)有限公司 | The method and system of network application
CN103685453A (en)* | 2013-09-11 | 2014-03-26 | 华中科技大学 | A method for obtaining metadata in a cloud storage system
CN103685453B (en)* | 2013-09-11 | 2016-08-03 | 华中科技大学 | The acquisition methods of metadata in a kind of cloud storage system
WO2015176659A1 (en)* | 2014-05-22 | 2015-11-26 | Huawei Technologies Co., Ltd. | System and method for pre-fetching
CN106462610A (en)* | 2014-05-22 | 2017-02-22 | 华为技术有限公司 | A pre-acquisition system and method
CN104239435A (en)* | 2014-08-29 | 2014-12-24 | 四川长虹电器股份有限公司 | Distributed picture caching method based on picture thumbnail processing
CN104580437A (en)* | 2014-12-30 | 2015-04-29 | 创新科存储技术(深圳)有限公司 | Cloud storage client and high-efficiency data access method thereof
CN105138545A (en)* | 2015-07-09 | 2015-12-09 | 中国科学院计算技术研究所 | Method and system for asynchronously pre-reading directory entries in distributed file system
CN105138545B (en)* | 2015-07-09 | 2018-10-09 | 中国科学院计算技术研究所 | The asynchronous method and system pre-read of directory entry in a kind of distributed file system
CN105677892A (en)* | 2016-01-29 | 2016-06-15 | 华为技术有限公司 | Method and device for reading catalog subitem metadata
CN105677892B (en)* | 2016-01-29 | 2018-12-25 | 华为技术有限公司 | A kind of method and device reading catalogue subitem metadata
CN106570113A (en)* | 2016-10-25 | 2017-04-19 | 中国电力科学研究院 | Cloud storage method and system for mass vector slice data
CN106570113B (en)* | 2016-10-25 | 2022-04-01 | 中国电力科学研究院 | Mass vector slice data cloud storage method and system
CN107066503A (en)* | 2017-01-05 | 2017-08-18 | 郑州云海信息技术有限公司 | The method and device of magnanimity metadata burst distribution
CN106775994A (en)* | 2017-02-28 | 2017-05-31 | 郑州云海信息技术有限公司 | The method and device of a kind of metadata cluster catalogue scheduling
CN107291870A (en)* | 2017-06-15 | 2017-10-24 | 郑州云海信息技术有限公司 | Files in batch read method in a kind of distributed storage
CN107491545A (en)* | 2017-08-25 | 2017-12-19 | 郑州云海信息技术有限公司 | The catalogue read method and client of a kind of distributed memory system
CN108319634A (en)* | 2017-12-15 | 2018-07-24 | 创新科存储技术(深圳)有限公司 | The directory access method and apparatus of distributed file system
CN108319634B (en)* | 2017-12-15 | 2021-08-06 | 深圳创新科技术有限公司 | Directory access method and device for distributed file system
CN111258956A (en)* | 2019-03-22 | 2020-06-09 | 深圳市远行科技股份有限公司 | Method and equipment for pre-reading mass data files facing far end
CN111258956B (en)* | 2019-03-22 | 2023-11-24 | 深圳市远行科技股份有限公司 | Method and device for prereading far-end mass data files
CN110334073A (en)* | 2019-06-13 | 2019-10-15 | 腾讯科技(深圳)有限公司 | A kind of metadata forecasting method, device, terminal, server and storage medium
CN110321080A (en)* | 2019-07-02 | 2019-10-11 | 北京计算机技术及应用研究所 | A kind of warm data pool pre-head method of cross-node
CN112559574A (en)* | 2020-12-25 | 2021-03-26 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and readable storage medium
CN112559574B (en)* | 2020-12-25 | 2023-10-13 | 北京百度网讯科技有限公司 | Data processing method, device, electronic equipment and readable storage medium
CN112799589A (en)* | 2021-01-14 | 2021-05-14 | 新华三大数据技术有限公司 | Data reading method and device
CN114218170A (en)* | 2021-11-24 | 2022-03-22 | 新华三技术有限公司成都分公司 | File reading method and device
CN114647617A (en)* | 2022-04-18 | 2022-06-21 | 中国工商银行股份有限公司 | File reading method and device, computer equipment, storage medium and program product
CN115510016A (en)* | 2022-10-21 | 2022-12-23 | 济南浪潮数据技术有限公司 | A client response method, device and medium based on directory fragmentation


Legal Events

Date | Code | Title | Description
 | C06 | Publication | 
 | PB01 | Publication | 
 | C10 | Entry into substantive examination | 
 | SE01 | Entry into force of request for substantive examination | 
 | C12 | Rejection of a patent application after its publication | 
 | RJ01 | Rejection of invention patent application after publication | 

Application publication date: 2012-07-04
