Large-Capacity Cache and Methods for Data Storage and Reading, Memory Allocation and Reclamation

Technical Field

The present invention relates to the field of data storage, and in particular to a large-capacity cache and to methods for data storage and reading, memory allocation and reclamation.
Background Art

Disks and file systems are essential parts of a computer: they are where data is kept. However, disk access is far slower than memory access, so when large volumes of data must be handled, disk access becomes the bottleneck of data processing. To raise processing speed, a file system uses memory as a cache for the disk and provides a set of caching mechanisms.

At present, the cache of a computer file system is file-based. Because a large amount of identical data exists within a single file or across different files, many identical copies of that data end up in memory, wasting a great deal of memory space and lowering memory utilization. Once the file system consumes a large amount of memory, programs on the computer run slowly and the overall performance of the computer suffers.
Summary of the Invention

The object of the present invention is to provide a large-capacity cache and methods for data storage and reading, memory allocation and reclamation, thereby solving the foregoing problems of the prior art.

To achieve this object, the technical solution adopted by the present invention is as follows:
A large-capacity cache, comprising:

a data storage module for storing data in the form of data blocks; and

a data reading module for reading data in the form of data blocks.
Further, the large-capacity cache also comprises:

a memory allocation module for, when no free memory is available, allocating a first memory block from a memory pool and dividing the first memory block into multiple second memory blocks of fixed size.
Further, the large-capacity cache also comprises:

a memory reclamation module for reclaiming memory based on updates to the access time and access count of the data.
A data storage method for the above large-capacity cache comprises the steps of:

S1: receiving a data block and the feature code corresponding to the data block;

S2: retrieving the data block by its feature code; if the data block is found, discarding the received data block and feature code; if not, performing step S3;

S3: saving the data block and its feature code;

S4: building a feature-code index table, the index table comprising the feature code and the data block corresponding to it;

S5: updating the access time and access count of the data block.

Further, before step S1 the method may also comprise a step of computing the feature code of the data block.

Specifically, computing the feature code of the data block consists of computing the MD5 value of the data block.
A data reading method for the above large-capacity cache comprises the steps of:

S1: receiving the feature code corresponding to a data block;

S2: retrieving the data block by the feature code; if the data block is found, performing step S3; if not, discarding the received feature code;

S3: reading the data block;

S4: saving the feature code;

S5: building a feature-code index table, the index table comprising the feature code and the data block corresponding to it;

S6: updating the access time and access count of the data block.
A memory allocation method for the above large-capacity cache comprises the steps of:

S1: checking the free-memory list; if there is no free memory, performing steps S2-S5; if free memory is available, performing steps S4-S5;

S2: allocating a first memory block from a memory pool and dividing the first memory block into multiple second memory blocks of fixed size;

S3: adding the second memory blocks to the free-memory list;

S4: obtaining memory from the free-memory list;

S5: marking that memory as in use in the free-memory list.
A memory reclamation method for the above large-capacity cache comprises the steps of:

S1: starting a timer;

S2: obtaining the access time and access count of a data block;

S3: checking whether the data block is a non-hotspot data block; if it is not, checking the next data block; if it is, performing step S4;

S4: removing the data block;

S5: checking whether the first memory block is entirely idle; if so, releasing the first memory block; if not, returning the second memory block to the free-memory list.

Further, after step S5, the method may also comprise releasing the first memory block once the multiple second memory blocks composing it are all free.
The beneficial effects of the invention are as follows:

For the processing of large volumes of data, the present invention creates a large-capacity cache and implements its methods. The cache and its implementation operate on data blocks rather than on files, so identical data no longer occupies multiple regions of memory; when large volumes of data are processed, each identical data block is processed only once in the cache, with no large-scale duplication. The technical solution therefore serves file-system accesses from memory and reduces file-system accesses to disk, raising the read speed for large data volumes, while also saving a great deal of memory space and raising memory utilization; in turn, programs on the computer run faster and the overall performance of the computer improves.
Brief Description of the Drawings

Fig. 1 is a schematic diagram of the large-capacity cache provided by an embodiment of the present invention;

Fig. 2 is a flowchart of the data storage method of the large-capacity cache provided by an embodiment of the present invention;

Fig. 3 is a flowchart of the data reading method of the large-capacity cache provided by an embodiment of the present invention;

Fig. 4 is a flowchart of the memory allocation method of the large-capacity cache provided by an embodiment of the present invention;

Fig. 5 is a flowchart of the memory reclamation method of the large-capacity cache provided by an embodiment of the present invention.
Detailed Description of the Embodiments

To make the object, technical solution and advantages of the present invention clearer, the invention is further elaborated below in conjunction with the accompanying drawings. It should be understood that the embodiments described here serve only to explain the invention and are not intended to limit it.
As shown in Fig. 1, an embodiment of the present invention provides a large-capacity cache, which may comprise:

a data storage module for storing data in the form of data blocks; and

a data reading module for reading data in the form of data blocks.

In the prior art, because the cache of a computer file system is file-based and a large amount of identical data exists within a single file or across different files, a great deal of memory space is wasted during operation, memory utilization falls, and the overall performance of the computer suffers. To solve this problem, the embodiment of the present invention manages the cached data of the file system in the form of data blocks. In this way, each identical data block is processed only once in memory rather than repeatedly, avoiding the waste of memory space and the drop in resource utilization, and thereby improving the overall performance of the computer.

Compared with prior-art read operations performed on whole files, the large-capacity cache provided by the embodiment of the present invention reads data in the form of data blocks. When facing large volumes of data, and especially large volumes of constantly changing data, its improvement of memory utilization and of overall system performance is more pronounced.
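A minimal sketch, not the patent's actual implementation, can illustrate why block-level caching deduplicates data that file-level caching cannot: two different "files" that share content reuse the same cached blocks. The block size and helper names here are illustrative assumptions.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size for illustration; real caches use e.g. 4 KiB


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


cache = {}  # feature code -> data block


def store(block: bytes) -> str:
    """Store a block keyed by its MD5 feature code; duplicates are kept once."""
    code = hashlib.md5(block).hexdigest()
    cache.setdefault(code, block)  # identical blocks occupy memory only once
    return code


file_a = b"AAAABBBBCCCC"
file_b = b"BBBBCCCCDDDD"  # shares two blocks with file_a
codes_a = [store(b) for b in split_blocks(file_a)]
codes_b = [store(b) for b in split_blocks(file_b)]

# 6 logical blocks were stored, but only 4 distinct blocks occupy memory
print(len(codes_a) + len(codes_b), len(cache))  # -> 6 4
```

A file-based cache would hold both 12-byte "files" in full; the block-based cache holds only the four distinct blocks.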
In an embodiment provided by the present invention, the large-capacity cache may also comprise:

a memory allocation module for, when no free memory is available, allocating a first memory block from a memory pool and dividing the first memory block into multiple second memory blocks of fixed size.

This scheme pre-allocates memory, which raises data-reading speed and in turn improves the overall performance of the system.
In an embodiment provided by the present invention, the large-capacity cache may also comprise:

a memory reclamation module for reclaiming memory based on updates to the access time and access count of the data.

With this scheme, the memory occupied by non-hotspot data can be reclaimed, saving memory and raising memory utilization.
As shown in Fig. 2, an embodiment of the present invention provides the data storage method of the above large-capacity cache, comprising the steps of:

S1: receiving a data block and the feature code corresponding to the data block;

S2: retrieving the data block by its feature code; if the data block is found, discarding the received data block and feature code; if not, performing step S3;

S3: saving the data block and its feature code;

S4: building a feature-code index table, the index table comprising the feature code and the data block corresponding to it;

S5: updating the access time and access count of the data block.

Before step S1, the method may also comprise a step of computing the feature code of the data block; specifically, this may consist of computing the MD5 value of the data block.

An MD5 computation treats the whole input as one long binary message and, through an irreversible mapping algorithm, produces an MD5 message digest, the MD5 value. A given file therefore has a single MD5 value, and any change to the file changes that value, so computing MD5 over data marks the data precisely. In the embodiment of the present invention, the computed MD5 value serves as the feature code of a data block; since in practice each data block's MD5 value is unique, using it as the feature code identifies blocks accurately and avoids misidentification.
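The feature-code computation can be sketched with Python's standard `hashlib`; the function name is illustrative. Note that for accidental duplicates MD5 collisions are astronomically unlikely, which is what the scheme relies on.

```python
import hashlib


def feature_code(block: bytes) -> str:
    """Return the MD5 digest of a data block, used as its feature code."""
    return hashlib.md5(block).hexdigest()


a = feature_code(b"hello world")
b = feature_code(b"hello world!")  # a one-byte change

print(a == feature_code(b"hello world"))  # identical data -> same code: True
print(a == b)                             # any change -> different code: False
```

The digest is a fixed 32-character hexadecimal string regardless of block size, which keeps the index table compact.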
In the embodiment of the present invention, a feature-code index table is built for the data blocks. Because the index table maps each feature code to its corresponding data block, retrieving blocks by feature code is fast and accurate, enabling fast and accurate data storage in the cache. When facing large volumes of data, and especially large volumes of constantly changing data, the gains in storage speed and accuracy, and hence in overall system performance, are more pronounced.

At the same time, because data storage in this embodiment is block-based, identical data is guaranteed to be stored only once in the cache. The same data never occupies multiple memory blocks, no memory is wasted, and both memory utilization and the overall performance of the system improve.
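Steps S1-S5 of the storage method can be sketched as follows. The class and attribute names are assumptions for illustration, and the metadata layout (a timestamp plus a counter) is one plausible realization of S5.

```python
import hashlib
import time


class BlockStore:
    """Sketch of storage steps S1-S5 of the data storage method."""

    def __init__(self):
        self.index = {}  # feature-code index table (S4): code -> data block
        self.meta = {}   # code -> (last access time, access count) (S5)

    def store(self, block, code=None):
        if code is None:                       # optional pre-step:
            code = hashlib.md5(block).hexdigest()  # compute the feature code
        if code not in self.index:             # S2: retrieve by feature code
            self.index[code] = block           # S3/S4: save block, index it
        # duplicate case (S2): the received copy is simply discarded
        _, count = self.meta.get(code, (0.0, 0))
        self.meta[code] = (time.time(), count + 1)  # S5: update metadata
        return code


store = BlockStore()
c1 = store.store(b"some data block")
c2 = store.store(b"some data block")  # duplicate: discarded, metadata updated

print(c1 == c2, len(store.index), store.meta[c1][1])  # -> True 1 2
```

Even when the same block arrives repeatedly, only one copy lives in the index table; the second arrival merely refreshes its access time and count.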
As shown in Fig. 3, an embodiment of the present invention provides the data reading method of the above large-capacity cache, comprising the steps of:

S1: receiving the feature code corresponding to a data block;

S2: retrieving the data block by the feature code; if the data block is found, performing step S3; if not, discarding the received feature code;

S3: reading the data block;

S4: saving the feature code;

S5: building a feature-code index table, the index table comprising the feature code and the data block corresponding to it;

S6: updating the access time and access count of the data block.

Before step S1, the method may also comprise a step of computing the feature code of the data block; specifically, this may consist of computing the MD5 value of the data block.
As explained above for the storage method, an MD5 computation produces a message digest that changes whenever the underlying data changes, so using the MD5 value as the feature code of a data block identifies the block accurately and avoids misidentification.

In the embodiment of the present invention, a feature-code index table is built for the data blocks. Because the index table maps each feature code to its corresponding data block, retrieving blocks by feature code is fast and accurate, enabling fast and accurate data reading from the cache. When facing large volumes of data, and especially large volumes of constantly changing data, the gains in reading speed and accuracy, and hence in overall system performance, are more pronounced.

At the same time, because data reading in this embodiment is block-based, identical data is guaranteed to be read only once from the cache. The same data never occupies multiple memory blocks, no memory is wasted, and both memory utilization and the overall performance of the system improve.
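The read path (S1-S6) can be sketched as a lookup against the same feature-code index table that the storage method maintains. The names and the metadata layout are illustrative assumptions, not the patent's identifiers.

```python
import time

index = {"abc123": b"cached block"}  # feature code -> data block
meta = {}                            # feature code -> (last time, count)


def read_block(code):
    """Sketch of read steps: look up by feature code, update metadata."""
    block = index.get(code)          # S2: retrieve by feature code
    if block is None:
        return None                  # not found: discard the received code
    _, count = meta.get(code, (0.0, 0))
    meta[code] = (time.time(), count + 1)  # S6: update time and count
    return block                     # S3: read the data block


print(read_block("abc123"))   # -> b'cached block'
print(read_block("missing"))  # -> None
```

A miss costs only a failed lookup; no memory is touched and the stray feature code is dropped, which keeps the hot path cheap.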
As shown in Fig. 4, an embodiment of the present invention provides the memory allocation method of the above large-capacity cache, comprising the steps of:

S1: checking the free-memory list; if there is no free memory, performing steps S2-S5; if free memory is available, performing steps S4-S5;

S2: allocating a first memory block from a memory pool and dividing the first memory block into multiple second memory blocks of fixed size;

S3: adding the second memory blocks to the free-memory list;

S4: obtaining memory from the free-memory list;

S5: marking that memory as in use in the free-memory list.

In the embodiment of the present invention, when no free memory is available, a first memory block is allocated from the memory pool and then divided into multiple second memory blocks of fixed size. This pre-allocates memory: when a data block needs memory for storage, it can be placed directly in a second memory block of the appropriate size, without the system having to allocate memory anew. The allocation time during data access is thereby saved, data-reading speed rises, and the overall performance of the system improves.
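Steps S1-S5 of the allocation scheme can be sketched with offsets into a notional pool; the two block sizes and variable names are illustrative assumptions (the patent does not fix them).

```python
FIRST_BLOCK = 1 << 16   # 64 KiB chunk carved from the memory pool
SECOND_BLOCK = 1 << 12  # 4 KiB fixed-size blocks handed to callers

free_list = []   # offsets of free second memory blocks
in_use = set()   # offsets currently marked as in use
pool_offset = 0  # next unused region of the memory pool


def allocate():
    """Return the offset of a free second memory block (steps S1-S5)."""
    global pool_offset
    if not free_list:  # S1: no free memory available
        base, pool_offset = pool_offset, pool_offset + FIRST_BLOCK  # S2
        free_list.extend(range(base, base + FIRST_BLOCK, SECOND_BLOCK))  # S3
    off = free_list.pop()  # S4: obtain memory from the free list
    in_use.add(off)        # S5: mark it as in use
    return off


a, b = allocate(), allocate()
print(len(free_list), len(in_use))  # -> 14 2  (16 blocks carved, 2 in use)
```

The expensive step (S2/S3) runs only once per 16 allocations in this configuration; every other allocation is a constant-time pop from the free list.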
As shown in Fig. 5, an embodiment of the present invention provides the memory reclamation method of the above large-capacity cache, comprising the steps of:

S1: starting a timer;

S2: obtaining the access time and access count of a data block;

S3: checking whether the data block is a non-hotspot data block; if it is not, checking the next data block; if it is, performing step S4;

S4: removing the data block;

S5: checking whether the first memory block is entirely idle; if so, releasing the first memory block; if not, returning the second memory block to the free-memory list.

After step S5, the method may also comprise releasing the first memory block once the multiple second memory blocks composing it are all free.

For second memory blocks placed on the free-memory list, once enough of them can be recombined into a complete first memory block, that larger first memory block can be released, reclaiming the memory.

In the embodiment of the present invention, removing non-hotspot data blocks and releasing the memory they occupied reclaims the memory held by non-hotspot data, further saving memory space.

Because the release of a first memory block is driven by inspecting the access time and access count of the data, it tracks the data as the data continually changes, so memory can be reclaimed in real time. For large volumes of data, and especially for large volumes of constantly changing data, reclaiming memory in real time is of great significance for saving memory and raising memory utilization.
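The periodic reclamation pass (S2-S4) can be sketched as follows. The hotspot criterion used here, recently accessed or frequently accessed, and its thresholds are assumed for illustration; the patent leaves the exact test open, as is the simplification of dropping evicted entries rather than modeling the free list of step S5.

```python
import time

IDLE_LIMIT = 60.0  # seconds since last access before a block counts as cold
MIN_HITS = 5       # access count below this counts as cold


def reclaim(cache, meta, now=None):
    """One timer-driven pass: evict blocks judged non-hotspot (S3, S4)."""
    now = time.time() if now is None else now
    for code in list(cache):
        last, hits = meta[code]                          # S2: get metadata
        if now - last > IDLE_LIMIT and hits < MIN_HITS:  # S3: non-hotspot?
            del cache[code]                              # S4: remove block
            del meta[code]  # its second memory block would return to the
                            # free list, and the first memory block would
                            # be released once all its parts are idle (S5)


cache = {"hot": b"x", "cold": b"y"}
meta = {"hot": (1000.0, 100), "cold": (0.0, 1)}
reclaim(cache, meta, now=1030.0)
print(sorted(cache))  # -> ['hot']
```

Requiring a block to be both idle and rarely used before eviction protects recently loaded data that has not yet accumulated accesses.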
By adopting the technical scheme disclosed above, the following beneficial effects are obtained: file-system accesses are served from memory and file-system accesses to disk are reduced, raising the read speed for large data volumes; memory space is greatly saved and memory utilization rises; and, in turn, programs on the computer run faster and the overall performance of the computer improves.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and for the identical or similar parts the embodiments may be referred to one another.

Those skilled in the art should understand that the order of the method steps provided by the above embodiments may be adjusted according to actual conditions, and the steps may also be performed concurrently according to actual conditions.

All or part of the steps in the methods of the above embodiments may be completed by relevant hardware under the instruction of a program, and the program may be stored in a storage medium readable by a computer device and used to perform all or part of the steps of the methods of the above embodiments. The computer device is, for example, a personal computer, a server, a network device, an intelligent mobile terminal, a smart home device, a wearable smart device or an in-vehicle smart device; the storage medium is, for example, RAM, ROM, a magnetic disk, a magnetic tape, an optical disc, flash memory, a USB drive, a portable hard drive, a memory card, a memory stick, a network server or network cloud storage.
Finally, it should also be noted that in this document relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise" and "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. In the absence of further restrictions, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, article or device that comprises it.

The above is only a preferred embodiment of the present invention. It should be pointed out that those skilled in the art may make further improvements and modifications without departing from the principles of the invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention.