Technical Field
The invention relates to the field of computer application technology, and in particular to a method for implementing thin provisioning in a storage system.
Background Art
Thin provisioning is a capacity allocation technique. Rather than allocating storage space to an application all at once, it increases the allocated space gradually as the application's demand grows. Conventional thin provisioning presents the operating system with a large virtual drive: the drive's actual capacity is modest, but it reports a larger capacity, and the operating system can format a correspondingly larger file system on it. In a thin-provisioned storage system, physical resources need to be added only when used capacity grows. This defers the deployment of physical resources and improves device utilization.
The core of thin provisioning is to "deceive" the operating system. Because the operating system cannot distinguish virtual disk capacity from actual disk capacity, the virtual drive must monitor the operating system's IO (Input/Output) requests and extend the storage space when actual usage approaches a threshold. To that end, existing thin provisioning techniques manage a storage resource pool, monitor system IO requests, and allocate storage resources when space runs short. Some systems also reclaim resources. Existing patents cover resource allocation and management, such as "a method for a thin provisioning storage pool of a storage system and its organization and management" and "a method for implementing asynchronous full allocation in storage system thin provisioning", as well as system implementation, such as "a system and method for implementing dynamic expansion of storage system thin provisioning".
Document 1, Chinese invention patent application publication No. CN103020201A, discloses a method for a thin provisioning storage pool of a storage system and its organization and management. Used in a storage system, the method provides a system architecture for managing and operating on thin-provisioned storage pool space. Storage pool metadata is stored separately on a metadata device, the storage pool has exclusive use of the storage pool device, the storage pool device is referred to as the data device, and the storage pool metadata is redefined to contain both metadata device information and data device information. The method has three main problems. First, because the storage pool monopolizes the storage device and its space must be fixed, all physical devices have to be added to the pool at once, which leaves physical resources idle and wasted, and the pool cannot be grown or shrunk dynamically; this is the main technical problem of that patent. Second, the metadata device holding the storage pool metadata is a single point of failure, so availability is low. Third, because thin provisioning is applied to the storage pool, expansion requires a specialist, and what expansion yields is a block device that must still be formatted with a specific file system, which makes the process cumbersome.
Document 2, Chinese invention patent application publication No. CN103744622A, discloses a method for implementing asynchronous full allocation in storage system thin provisioning. The method involves a storage pool and a fully allocated volume. The storage pool occupies actual physical space. The fully allocated volume is a virtual drive presented to the operating system through virtual mapping; it also occupies actual physical space, which the storage pool supplies. The user can start using the fully allocated volume without waiting for its actual storage space to be completely allocated, and the space corresponding to the user-specified logical volume size is allocated from the storage pool to the fully allocated volume asynchronously in a single operation. The method has three main problems. First, because dynamic allocation is built on top of fully allocated volumes, the storage pool space must be fixed, so all physical devices have to be added to the pool at once, which leaves physical resources idle and wasted, and the pool cannot be grown or shrunk dynamically. Second, thin provisioning is achieved by inserting a fully allocated volume between the storage pool and the user's logical volume, which adapts poorly to other settings. Third, the split between resource pool and fully allocated volume separates the user from the administrator: when the user runs short of space, capacity can only be drawn from the fully allocated volume, while the resource pool itself can only be expanded by the administrator, which makes the thin provisioning process more complicated.
Document 3, Chinese invention patent with grant publication No. CN102855093B, discloses a system and method for implementing dynamic expansion of storage system thin provisioning. The system comprises: an expansion information acquisition module, which obtains the expansion command and expansion size supplied by the user and passes them to the expansion information parsing module; an expansion information parsing module, which splits the expansion command into an expansion suspend command and an expansion resume command and passes them to the IO redirection layer expansion module; an IO redirection layer expansion module, which sends the expansion suspend command and expansion size to the thin provisioning expansion module and sends the expansion resume command to the storage pool recovery module; a thin provisioning expansion module, which expands the metadata of the storage system's storage pool according to the received expansion size and sends the metadata to the storage pool recovery module; and a storage pool recovery module, which reactivates the storage pool according to the received metadata and expansion resume command. The system and method have two main problems. First, once the storage space has been set it cannot be changed, so it cannot be expanded dynamically and unused physical device resources within it are wasted; this is the main problem of that method. Second, expanding a logical volume in use requires suspending for expansion and then resuming, so the storage service is unavailable during expansion.
Existing thin provisioning techniques are implemented mainly at the block device layer. They "deceive" the operating system by presenting a larger virtual disk and add storage resources gradually as actual usage grows. Existing thin provisioning systems consist mainly of a storage resource pool and multiple thin volumes. To track resource usage, the thin provisioning management system must keep additional metadata that ensures resources are allocated on demand and that thin volumes do not conflict over resources.
Because existing approaches are implemented mainly at the block device layer, they have the following drawbacks: 1. The user of the resources (the operating system) is separated from their manager (the thin provisioning system), so extra data and mechanisms are needed to guarantee that resources are available. 2. The operating system has no information about the space actually available at the block device layer; if it uses storage space that has not yet been allocated or mapped, a serious error results. 3. Thin provisioning uses a smaller storage space to present a larger one, and the operating system does not access logical addresses in a fully contiguous way, so the virtual drive's logical addresses must be mapped to actual physical addresses. 4. Existing thin provisioning techniques make it easy to extend space but hard to identify and reclaim space that is no longer needed.
Summary of the Invention
To overcome the poor practicality of existing thin provisioning methods for storage systems, the invention provides a method for implementing thin provisioning in a storage system. The method unifies the user and the manager of resources. By adding control information to the file system metadata, it ensures that storage is allocated on demand and that data is never placed in unallocated space; combined with the API provided by a cloud storage system, it expands and uses storage space online, thereby achieving thin provisioning. The file system metadata is redefined to include device information and address mapping information; within the file system, the logical space is divided into allocated and unallocated space and address mapping is managed; and reads and writes are monitored to ensure that they never touch unallocated space. The invention makes storage space easy to manage, allows it to grow and shrink dynamically, and is highly practical.
The technical solution adopted by the invention to solve its technical problem is a method for implementing thin provisioning in a storage system, characterized by the following steps:
The file system metadata management module defines the layout and format of the metadata. Through this metadata, multiple block storage devices are managed and organized into a unified file system that serves external users. The operating system and applications call the unified file system, which provides the IO read/write methods and implements the IO read/write monitoring module within them. The IO read/write monitoring module monitors IO write requests and allocates available block space to each write operation. During monitoring it determines whether block space is available and whether the space being allocated has reached the alert condition. The storage space monitoring module and the IO read/write monitoring module jointly handle shortages of block space. Once used space reaches the preset threshold fraction of all available space, the IO read/write monitoring module sends a message to the storage space monitoring module, which, according to a predetermined policy, either sends an alert to the user or automatically requests block storage resources through the interface provided by cloud storage or a SAN (storage area network) and uses the storage space scaling adjustment module to expand the unified file system automatically. The file system scaling adjustment module expands the file system by modifying its metadata; in the unified file system, this scaling adjusts the size of the virtual space.
The file system consists of multiple block groups. The first block group stores the superblock, which records the number and locations of the block groups; the superblock is also backed up in several other block groups. Each block group contains Group Descriptors, a data block bitmap, an inode bitmap, and the inode table and data table that hold the actual data. The Group Descriptors include the group information, group type, device information, location of the next group, alert flag, and other general information.
The group type is either normal or dummy. A dummy block group records only the size of the virtual space, which is determined by the block size and block count in its group info; it holds no inode or data information. Given a predetermined alert threshold, reaching certain blocks indicates that the ratio of used space to available space has hit that threshold, at which point the administrator must be notified or storage resources must be requested and added automatically.
Each block group records its own block size and block count, along with the location of the logically adjacent next block group. The last block group holds the virtual space information: only its block size and block count are meaningful, and it has no actual storage behind it. Its next block group is empty.
Deploying the file system across multiple disks is similar to the single-disk case: the file system is still organized as a chain of block groups. Block groups on earlier disks point in turn to the block groups that follow, so the logical space they form is contiguous, and the next block group of the last block group is empty.
To add a storage device to the file system dynamically, block groups are first allocated on the new disk, the last of which is a dummy block group. The old dummy block group is read, and the size of the new dummy block group is obtained by subtracting the newly added available space from the original virtual space. The penultimate block group of the original file system (the last one before the old dummy block group) is located, and its next pointer is set to the first block group on the new disk. The superblock is updated to reflect the added block groups. Finally, the alert block group is repositioned according to the available space, used space and alert threshold: the alert flag of the old alert block group is cleared, the block group containing the new threshold position is computed, and its alert flag is set.
After a storage device has been added to the system, the file system size needs to be increased. Increasing the file system size only enlarges the virtual space, so it is enough to modify the dummy block group and raise its virtual block count.
Before the file system begins serving users, the alert block group must be designated and the alert flag of the corresponding block group set.
When a user extends or writes a file and space must be allocated, the block group containing the file's inode is located first and checked for enough free blocks. If it has them, the group is checked for whether it is an ordinary block group. If so, space is allocated directly and the data is written. If it is the alert block group, available space is running low: an alert request is sent or a device is added automatically, and the blocks are then allocated for the read or write. If the current block group lacks enough space, the search moves to the next block group and checks whether it is a dummy group. If it is not, its free blocks are checked in the same way. If it is, no usable space remains, and an alert must be sent or a device added automatically.
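The allocation walk just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: `Group`, `allocate`, `OutOfSpace` and the `on_alert` callback are hypothetical names, and the chain of block groups is modelled as a list whose `next_group` fields hold list indices.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Group:
    free_blocks: int
    is_dummy: bool = False            # dummy groups have no real storage
    alert_flag: bool = False          # set on the alert block group
    next_group: Optional[int] = None  # index of the logically next group


class OutOfSpace(Exception):
    """Raised when the walk reaches the dummy group (hypothetical)."""


def allocate(groups: List[Group], start: int, needed: int,
             on_alert: Callable[[str], None]) -> int:
    """Allocate `needed` blocks, starting at the inode's block group.

    Returns the index of the group that supplied the blocks.
    """
    idx: Optional[int] = start
    while idx is not None:
        group = groups[idx]
        if group.is_dummy:
            # Only virtual space is left: alert (or add a device) first.
            on_alert("no backed space left")
            raise OutOfSpace
        if group.free_blocks >= needed:
            if group.alert_flag:
                # Threshold reached: alert, then still serve the request.
                on_alert("alert threshold reached")
            group.free_blocks -= needed
            return idx
        idx = group.next_group  # not enough room here, try the next group
    raise OutOfSpace
```

In this sketch the alert is raised before blocks are taken from the alert block group, which matches the order given in the text; what the callback does (notify an administrator or attach a device) is left to the caller.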
The beneficial effects of the invention are as follows. The method unifies the user and the manager of resources. By adding control information to the file system metadata, it ensures that storage is allocated on demand and that data is never placed in unallocated space; combined with the API provided by a cloud storage system, it expands and uses storage space online, thereby achieving thin provisioning. The file system metadata is redefined to include device information and address mapping information; within the file system, the logical space is divided into allocated and unallocated space and address mapping is managed; and reads and writes are monitored to ensure that they never touch unallocated space. The invention makes storage space easy to manage, allows it to grow and shrink dynamically, and is highly practical.
The invention is described in detail below with reference to the drawings and specific embodiments.
Brief Description of the Drawings
Fig. 1 is a diagram of the modules that make up the storage system in the method of the invention.
Fig. 2 is a diagram of the metadata structure of the file system metadata management module in the method of the invention.
Fig. 3 is a structural diagram of a normal block group in the file system in the method of the invention.
Fig. 4 is a structural diagram of an alert block group in the file system in the method of the invention.
Fig. 5 is a structural diagram of a dummy (empty) block group in the file system in the method of the invention.
Fig. 6 is a schematic diagram of the file system layout on a single disk in the method of the invention.
Fig. 7 is a schematic diagram of the file system layout on multiple disks in the method of the invention.
Fig. 8 is a flowchart of storage space scaling adjustment in the method of the invention.
Fig. 9 is a flowchart of file system scaling adjustment in the method of the invention.
Fig. 10 is a flowchart of the storage space monitoring module in the method of the invention.
Fig. 11 is a flowchart of read/write monitoring and space monitoring by the read/write module in the method of the invention.
Fig. 12 is a diagram showing the relationships among the terms used in the method of the invention, with explanations.
Detailed Description
Refer to Figs. 1-12. The specific steps of the method of the invention for implementing thin provisioning in a storage system are as follows:
The system used by the method comprises: (1) a file system metadata management module; (2) an IO read/write monitoring module, where IO stands for Input/Output; (3) a storage space monitoring module; (4) a file system scaling adjustment module; (5) a storage space scaling adjustment module. The role of each part is as follows:
The file system metadata management module defines how file and storage device metadata are organized;
The IO read/write monitoring module monitors IO read and write requests and, using the metadata provided by the file system metadata management module, finds available blocks and stores the data;
The storage space monitoring module monitors space usage and, when usage reaches a predetermined threshold, either notifies the administrator by one of several means to add storage space or adds storage space automatically through a predetermined command;
The file system scaling adjustment module provides methods for adding, deleting and modifying file system metadata, so that logical space can be added to or removed from the file system;
The storage space scaling adjustment module provides methods for adding, deleting and modifying the address mapping metadata in the file system, so that physical space managed by the file system can be added or removed.
The invention discloses a method for thin provisioning of a storage system. By keeping in the file system the metadata needed to manage storage devices, physical space and logical space, the method presents the operating system with a file system of larger capacity. The file system's actual available space is no larger than the physical space. When actual usage exceeds a given threshold, the system can alert the user, or it can automatically request block storage resources from other storage systems according to predetermined policies and commands and add them to the system.
Refer to Fig. 12. The physical space is the sum of the capacities of the disks used by the file system. The logical space is the size the file system reports to the operating system; it comprises the available space, for which storage resources have been allocated, and the virtual space, for which they have not. The available space should be less than or equal to the physical space. Once the file system is in use, the space occupied by stored data is the used space. The alert value is the ratio of used space to available space at which the system begins alerting the user to add space. In this invention, file system scaling means adjusting the size of the virtual space, and storage space scaling adjustment means adjusting the size of the available space.
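The relationships among these terms can be written down as a small Python sketch. The class and method names (`SpaceReport`, `validate`, `alert_due`) are hypothetical, all quantities are assumed to be counted in blocks, and the constraint that used space cannot exceed available space is implied by the definitions rather than stated outright.

```python
from dataclasses import dataclass


@dataclass
class SpaceReport:
    """Space figures for one file system, all counted in blocks."""
    physical: int   # sum of the capacities of all disks in use
    available: int  # space backed by allocated storage resources
    virtual: int    # space reported but not yet backed by storage
    used: int       # space occupied by stored data

    @property
    def logical(self) -> int:
        # The size the file system reports to the operating system.
        return self.available + self.virtual

    def validate(self) -> None:
        # used <= available (implied) and available <= physical (stated).
        if not 0 <= self.used <= self.available <= self.physical:
            raise ValueError("inconsistent space figures")

    def alert_due(self, threshold_percent: int) -> bool:
        # Integer arithmetic avoids floating-point rounding at the boundary.
        return self.used * 100 >= threshold_percent * self.available
```

For instance, with 800 available blocks and an 80% alert value, the alert becomes due once 640 blocks are in use, regardless of how large the virtual space is.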
Refer to Fig. 1. The file system metadata management module is the foundation for all other functions in the system; it defines the layout and format of the metadata. Through this metadata, multiple block storage devices can be managed and organized into a unified file system that serves external users. The operating system and applications use the file system of the invention through system calls; the file system provides the IO read/write methods and implements the IO read/write monitoring module within them. The IO read/write monitoring module monitors IO write requests and allocates available block space to each write operation. In doing so it determines whether block space is available and whether the space being allocated has reached the alert condition. The storage space monitoring module works mainly with the IO read/write monitoring module to handle shortages of block space. Once used space reaches a given proportion of all available space (a preset threshold, for example 80%), the IO read/write monitoring module sends a message to the storage space monitoring module, which, according to a predetermined policy, either sends an alert to the user or automatically requests block storage resources through the interface provided by another storage system (cloud storage, a SAN storage area network, and so on) and uses the storage space scaling adjustment module to expand the system automatically. The file system scaling adjustment module expands the file system by modifying its metadata; here, file system scaling adjusts the size of the virtual space. For example, if the file system was originally 100 GB, adjusting the size of the virtual space lets the operating system see a 200 GB file system without any additional storage device being added. The storage space scaling adjustment module adds and removes available space by modifying the file system metadata.
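The policy choice made by the storage space monitoring module can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: `handle_threshold`, `notify` and `provision` are hypothetical names, and `provision` merely stands in for whatever interface a cloud or SAN system exposes; no real provider API is implied.

```python
from typing import Callable, Optional


def handle_threshold(used: int, available: int, threshold_percent: int,
                     notify: Callable[[str], None],
                     provision: Optional[Callable[[], int]] = None) -> str:
    """Decide what to do after a write, given current usage in blocks.

    `provision`, if given, requests more block storage and returns the
    number of blocks obtained; otherwise the user is alerted.
    """
    if used * 100 < threshold_percent * available:
        return "ok"  # below the preset threshold, nothing to do
    if provision is not None:
        # Policy: expand automatically through the external interface.
        added = provision()
        return f"expanded by {added} blocks"
    # Policy: leave the decision to the user or administrator.
    notify(f"used {used} of {available} blocks; add storage space")
    return "alerted"
```

Passing the two policies as callables keeps the monitoring logic independent of how alerts are delivered or where new storage comes from.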
Refer to Fig. 2. The file system consists of multiple block groups. The first block group stores the superblock, which records the number and locations of the block groups; the superblock is also backed up in several other block groups. Each block group contains Group Descriptors, a data block bitmap, an inode bitmap, and the inode table and data table that hold the actual data. The Group Descriptors include the group information (block size, block count and so on), group type, device information, location of the next group, alert flag, and other general information.
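As a rough illustration of this layout, the descriptor fields listed above could be modelled in Python as follows. The field names, types and the `SuperBlock` shape are assumptions chosen for readability; the text does not specify an on-disk encoding, and the bitmaps and tables are omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class GroupType(Enum):
    NORMAL = "normal"
    DUMMY = "dummy"


@dataclass
class GroupDescriptor:
    block_size: int             # group info: size of one block in bytes
    block_count: int            # group info: number of blocks in the group
    group_type: GroupType       # normal or dummy
    device: Optional[str]       # device information (None for a dummy group)
    next_group: Optional[int]   # location of the next group, None if last
    alert_flag: bool = False    # set on the alert block group

    def size_bytes(self) -> int:
        # Space described by this group, real or virtual.
        return self.block_size * self.block_count


@dataclass
class SuperBlock:
    group_count: int            # number of block groups
    group_locations: List[int]  # where each block group starts
```

A dummy group would carry only a meaningful `block_size` and `block_count`, with `next_group` left as `None`.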
Refer to Figs. 3, 4 and 5. The group type is either normal or dummy. A dummy block group records only the size of the virtual space, which is determined by the block size and block count in its group info; it holds no inode or data information. Given a predetermined alert threshold, reaching certain blocks indicates that the ratio of used space to available space has hit that threshold, at which point the administrator must be notified or storage resources must be requested and added automatically.
Refer to Fig. 6. Each block group records its own block size and block count, along with the location of the logically adjacent next block group. The last block group holds the virtual space information: only its block size and block count are meaningful, and it has no actual storage behind it. Its next block group is empty.
Refer to Fig. 7. Deploying the file system across multiple disks is similar to the single-disk example: the file system is still organized as a chain of block groups. Block groups on earlier disks point in turn to the block groups that follow, so the logical space they form is contiguous, and the next block group of the last block group is empty.
Refer to Figure 8. To add a storage device to the system dynamically, first allocate block groups on the new disk, the last of which is a dummy block group. Read the old dummy block group and subtract the newly added usable space from the original virtual space to obtain the new dummy block group's information. Find the second-to-last block group of the original file system (the one immediately before the old dummy group) and point its next pointer at the first block group on the new disk. Update the superblock to reflect the added block groups. Then adjust the position of the alert block group according to the available space, used space, and alert threshold: clear the alert flag on the old alert block group, find the block group where the new threshold falls, and set its alert flag.
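The expansion steps above can be sketched as follows. This is a simplified model under several assumptions: the chain is a Python list whose order stands in for the next pointers, sizes are counted in blocks of equal size, the superblock update is omitted, and the alert position is computed from real capacity only. The names `Group`, `place_alert`, and `add_device` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Group:
    block_count: int
    is_dummy: bool = False
    alert: bool = False


def place_alert(chain: List[Group], threshold_pct: int) -> None:
    """Clear old alert flags, then flag the group where the threshold falls."""
    real = [g for g in chain if not g.is_dummy]
    for g in real:
        g.alert = False
    total = sum(g.block_count for g in real)
    seen = 0
    for g in real:
        seen += g.block_count
        if seen * 100 >= threshold_pct * total:   # integer math, no rounding
            g.alert = True
            return


def add_device(chain: List[Group], new_sizes: List[int],
               threshold_pct: int) -> List[Group]:
    """Splice new real groups in before a shrunken dummy group."""
    old_dummy = chain[-1]
    if not old_dummy.is_dummy:
        raise ValueError("last group must be the dummy group")
    added = sum(new_sizes)
    if added > old_dummy.block_count:
        raise ValueError("new device exceeds the remaining virtual space")
    new_groups = [Group(n) for n in new_sizes]
    # New virtual space = old virtual space minus the newly added real space.
    new_dummy = Group(old_dummy.block_count - added, is_dummy=True)
    # The former second-to-last group now precedes the first new group.
    new_chain = chain[:-1] + new_groups + [new_dummy]
    place_alert(new_chain, threshold_pct)
    return new_chain
```

Because the new dummy group shrinks by exactly the amount of real space added, the total size reported to users does not change when a device is added.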
Refer to Figure 9. After a storage device has been added, the file system size needs to be increased. Increasing the file system size only increases the virtual space, so it suffices to modify the dummy block group's information and increase its virtual block count.
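A minimal sketch of this step, assuming the requested growth is given in bytes and rounded up to whole blocks (the rounding rule and the names `DummyGroup` and `grow_virtual_space` are assumptions, not taken from the text):

```python
from dataclasses import dataclass


@dataclass
class DummyGroup:
    block_size: int
    block_count: int   # virtual blocks, with no storage behind them


def grow_virtual_space(dummy: DummyGroup, extra_bytes: int) -> int:
    """Increase the virtual block count and return the number of blocks added."""
    if extra_bytes < 0:
        raise ValueError("extra_bytes must be non-negative")
    extra_blocks = -(-extra_bytes // dummy.block_size)   # ceiling division
    dummy.block_count += extra_blocks
    return extra_blocks
```

No real block group is touched, which is why the operation is cheap compared with adding a device.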
Refer to Figure 10. Before the file system begins serving users, the alert block group must be set. For example, with an alert threshold of 70%, find the point 70% of the way through the available space and set the alert flag on the block group that contains it.
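One way to locate that block group is to accumulate the block counts of the real groups until the running total reaches the threshold. The function name `alert_group_index` is hypothetical, and treating a threshold that lands exactly on a group boundary as belonging to the earlier group is an assumption.

```python
from itertools import accumulate
from typing import List, Optional


def alert_group_index(real_block_counts: List[int],
                      threshold_pct: int) -> Optional[int]:
    """Return the index of the real group in which the threshold falls."""
    total = sum(real_block_counts)
    if total == 0:
        return None
    for i, cumulative in enumerate(accumulate(real_block_counts)):
        # Integer comparison avoids floating-point rounding at the boundary.
        if cumulative * 100 >= threshold_pct * total:
            return i
    return None
```

For three groups of 1000, 1000, and 2000 blocks and a 70% threshold, the mark sits at block 2800, which lies in the third group.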
Refer to Figure 11. When a user extends or writes a file and space must be allocated, first locate the block group containing the file's inode and check whether it has enough free blocks. If it does, check whether it is an ordinary block group (not an alert group). If it is ordinary, allocate the space and write the data directly. If it is an alert block group, available space is running low: send an alert request or add a device automatically, then allocate blocks for the read or write. If the current block group does not have enough space, move to the next block group and check whether it is a dummy group. If it is not, check whether it has enough free blocks. If it is a dummy group, no space is currently available, and an alert must be sent or a device added automatically.
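The allocation walk can be sketched as a loop over the chain. This is a simplified model with assumptions: the chain is a list in logical order, `start` is the index of the group holding the file's inode, and `on_alert` is a hypothetical callback standing in for either notifying the administrator or triggering automatic expansion.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Group:
    free_blocks: int
    is_dummy: bool = False
    alert: bool = False


def allocate(chain: List[Group], start: int, needed: int,
             on_alert: Callable[[str], None]) -> Optional[int]:
    """Return the index of the group that served the request, or None."""
    for i in range(start, len(chain)):
        g = chain[i]
        if g.is_dummy:
            # Reaching the dummy group means no real space is left.
            on_alert("out of real space")
            return None
        if g.free_blocks >= needed:
            if g.alert:
                # Space is running low; warn, then still serve the request.
                on_alert("threshold reached")
            g.free_blocks -= needed
            return i
        # Not enough free blocks here: move on to the next group.
    return None
```

A caller would typically pass a function that queues a notification or starts device provisioning, then retry the allocation once new block groups have been linked in.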
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware platform, and can of course also be implemented in hardware. On that understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied as a software product. The computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (a personal computer, server, network device, or the like) to execute the methods described in the embodiments of the present invention or in parts of those embodiments.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510205674.XA (CN104820575B) | 2015-04-27 | 2015-04-27 | Realize the method that storage system is simplified automatically |
| Publication Number | Publication Date |
|---|---|
| CN104820575A | 2015-08-05 |
| CN104820575B | 2017-08-15 |
| Publication number | Publication date |
|---|---|
| CN104820575A (en) | 2015-08-05 |
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| EXSB | Decision made by SIPO to initiate substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| TR01 | Transfer of patent right | Effective date of registration: 2020-07-07. Address after: Room 402, 4/F, Innovation and Technology Building, Northwestern Polytechnical University, 127 Youyi West Road, Beilin District, Xi'an City, Shaanxi Province 710072. Patentee after: XI'AN GUADA NETWORK TECHNOLOGY Co.,Ltd. Address before: 710072 Xi'an friendship West Road, Shaanxi, No. 127. Patentee before: Northwestern Polytechnical University |
| TR01 | Transfer of patent right | Effective date of registration: 2020-11-19. Address after: 2041-2, Building 1, Beihang Science and Technology Park, 588 Feitian Road, Xi'an City, Shaanxi Province. Patentee after: Xi'an kuanqin Standardization Technology Service Co., Ltd. Address before: Room 402, 4/F, Innovation and Technology Building, Northwestern Polytechnical University, 127 Youyi West Road, Beilin District, Xi'an City, Shaanxi Province 710072. Patentee before: XI'AN GUADA NETWORK TECHNOLOGY Co.,Ltd. |