Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the invention provides a data caching apparatus, which includes N levels of caches, where N is an integer greater than 1 and the access speed of the first-level cache is higher than that of the remaining levels of caches. The first-level cache is used for storing first-type data, and each level of cache other than the first-level cache is used for storing both the first-type data and second-type data.
Here, because the access speed of the first-level cache is higher than that of the remaining levels of caches, the first-level cache has the highest access speed among the N levels of caches and can serve as a high-speed cache.
In addition, since the first-level cache has the highest access speed, its storage capacity is generally smaller than that of the remaining levels of caches, and its capacity expansion cost is higher. In view of this, in the embodiment of the present invention, the first-level cache may be used to store the first-type data, ensuring that the first-type data is cached quickly and read with high real-time performance and high availability, while the remaining levels of caches may be used to store both the first-type data and the second-type data, ensuring that all types of data are stored completely, thereby reducing the risk of data loss and increasing the security and reliability of the data.
In the embodiment of the present invention, the data to be cached may be divided into first-type data and second-type data. Because the first-type data is cached in the first-level cache, it may be understood as data of higher importance, higher specialization, or more recent creation time, and so on. In contrast, the second-type data may be understood as data of lower importance, lower specialization, or older creation time, and so on. The classification criteria for the first-type data and the second-type data are not limited in the embodiments of the present invention. Together, the first-type data and the second-type data constitute the full data, and thus each level of cache other than the first-level cache may be understood as a cache for the full data.
Generally, the first-type data grows more slowly and the second-type data grows more rapidly, and the first-type data is generally smaller in quantity than the second-type data. Accordingly, the cache capacity required for the first-type data is generally smaller than that required for the second-type data. Caching the first-type data in the first-level cache therefore reduces the capacity expansion demand on the first-level cache; as the data volume increases, generally only the remaining levels of caches need to be expanded, which costs less than expanding the first-level cache.
With the rapid development of internet technology, data can be roughly classified into two types: PGC (Professionally Generated Content) data and UGC (User Generated Content) data. Taking video data as an example, video data can be roughly divided into PGC video data and UGC video data. Considering that the amount of UGC video data grows massively while UGC video data is lower than PGC video data in importance and specialization, the PGC video data can be taken as the first-type data and the UGC video data as the second-type data. When caching, only the PGC video data may be cached in the first-level cache, while the full video data (i.e., the PGC video data and the UGC video data) is cached in each of the remaining levels of caches.
Still taking video data as an example, the currently latest and hottest video data may be taken as the first-type data, for example, 5 million pieces of the latest and hottest video data, and the other video data as the second-type data. When caching, only the currently latest and hottest video data may be cached in the first-level cache, while the full video data is cached in each of the remaining levels of caches. This data sorting method is similar to the LRU (Least Recently Used) caching method, which evicts the least recently used data in favor of the most recently read data. Recently read data tends to be read again most often, so the LRU caching method can improve the caching performance of the system.
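For illustration only, the LRU policy described above can be sketched in Python; the following minimal fixed-capacity, in-memory version (the class name and capacity are assumptions, not part of the embodiments) shows how recently read entries survive while stale entries are evicted:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch: evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        # A read makes the entry the most recently used.
        self.entries.move_to_end(key)
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry (head of insertion order).
            self.entries.popitem(last=False)
```

In this sketch the most recently read entries always remain cached, matching the behavior described above for the latest and hottest video data.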
Optionally, the reading sequence of the N levels of caches during data reading is to read data from the first-level cache first and, if the data is not found there, to read it from the remaining levels of caches. Therefore, when the data exists in the first-level cache, the data required by the user can be read at the fastest speed, which improves the data reading speed.
Optionally, among the N levels of caches, the access speeds of the caches decrease level by level; that is, the access speed of the first-level cache is the highest, and the access speeds of the subsequent levels of caches decrease in sequence.
Furthermore, among the N levels of caches, the cache capacities increase level by level; that is, the cache capacity of the first-level cache is the smallest, and the cache capacities of the subsequent levels of caches increase in sequence.
Optionally, the reading sequence of the N levels of caches during data reading is to read data from the first-level cache first and, if the data is not found there, to penetrate to the next-level cache, and so on. The reading sequence thus matches the access speeds of the N levels of caches, so the data required by the user can be read at the fastest possible speed, which improves the data reading speed.
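As an illustrative sketch of this reading sequence (assuming each cache level exposes a dict-like get method, and the `levels` list is ordered from fastest to slowest), the penetration logic might look as follows:

```python
def read_through(levels, key):
    """Try each cache level in order of descending access speed.

    Returns the value from the first level holding the key,
    or None if no level holds it.
    """
    for level in levels:
        value = level.get(key)
        if value is not None:
            return value
    return None
```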
Optionally, N is equal to 3, and the N levels of caches include a first-level cache, a second-level cache, and a third-level cache;
the access speeds of the first-level cache, the second-level cache, and the third-level cache decrease in sequence.
Further, the cache capacities of the first-level cache, the second-level cache, and the third-level cache increase in sequence.
This embodiment provides a three-level cache architecture in which the first-type data can be cached in all three levels of caches simultaneously and the second-type data can be cached in two levels of caches simultaneously, balancing caching cost against data completeness.
For example, as shown in fig. 1, the first-level cache may be a distributed Redis (REmote DIctionary Server) cluster, which may be regarded as analogous to a CPU cache and is used for caching data that meets a preset condition. The second-level cache may be a Couchbase distributed cache: its stability and real-time performance meet the requirements, its storage capacity is greatly improved over Redis, its subsequent capacity expansion is more convenient than that of Redis, and it can be used to cache the full data. The third-level cache may be a low-cost HiKV distributed cache; similar to Hadoop, HiKV keeps three replicas of each data block, so its security is guaranteed, and its data real-time performance is no worse than that of Hadoop. It can serve both as the data source during read penetration and as a permanent cache for the full data.
Here, Redis stores data in a dictionary (key-value) structure and is an open-source advanced key-value storage system that can be used as a database, a cache, and a message queue broker. It supports data types such as strings, hashes, lists, sets, sorted sets, bitmaps, and HyperLogLogs.
Couchbase is a merger of CouchDB (an open-source document-oriented database management system) and Membase, and is a high-performance, highly scalable, and highly available distributed caching system.
HiKV is a distributed KV (key-value) data storage solution, mainly used for storing massive KV data with high-performance read and write access, and provides multiple data consistency models and multi-data-center support.
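To make the later examples concrete, the three tiers can be modeled with plain in-memory dictionaries; this is only a stand-in sketch, since a real deployment would go through the Redis, Couchbase, and HiKV client SDKs:

```python
# Illustrative stand-ins for the three cache tiers; in practice these
# would be clients for Redis, Couchbase, and HiKV respectively.
redis_tier = {}       # level 1: fastest, smallest, first-type data only
couchbase_tier = {}   # level 2: slower, larger, caches the full data
hikv_tier = {}        # level 3: permanent cache of the full data

# Ordered from fastest to slowest, matching the reading sequence above.
cache_levels = [redis_tier, couchbase_tier, hikv_tier]
```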
An embodiment of the invention also provides a data caching method, applied to a data caching apparatus having N levels of caches, where the access speed of the first-level cache is higher than that of the remaining levels of caches.
As shown in fig. 2, the data caching method includes the following steps:
step 101: data is acquired.
The data acquired in this step is data that needs to be cached; its type is not limited and may be, for example, document data, picture data, audio data, or video data.
In the embodiment of the invention, the type of the acquired data can be judged, and the data is cached hierarchically according to its type. If the data is the first-type data, step 1021 is executed; if the data is the second-type data, step 1022 is executed.
Step 1021: cache the data in each level of the N levels of caches respectively.
In this step, when the data is the first-type data, it is cached in every level of cache. On the one hand, the data then has multiple (at least two) backups, which improves its security. On the other hand, because the data is cached in the first-level cache, whose access speed is higher than that of the remaining levels of caches, the data can be cached quickly and read with high real-time performance and high availability.
Step 1022: cache the data in each level of cache other than the first-level cache of the N levels of caches respectively.
In this step, when the data is the second-type data, it is cached in each level of cache other than the first-level cache. On the one hand, the data then has multiple (at least two) backups, which improves its security. On the other hand, when the data volume increases, the remaining levels of caches can be expanded, which costs less than expanding the first-level cache.
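Steps 101, 1021, and 1022 can be sketched together as follows (a simplified illustration reusing the dict-like levels above; the type labels are assumptions):

```python
FIRST_TYPE = "first"    # e.g., PGC or latest-and-hottest data
SECOND_TYPE = "second"  # e.g., UGC or long-tail data

def cache_data(levels, key, value, data_type):
    """Route acquired data (step 101) to cache levels by its type."""
    if data_type == FIRST_TYPE:
        targets = levels        # step 1021: every level, including level 1
    else:
        targets = levels[1:]    # step 1022: skip the first-level cache
    for level in targets:
        level[key] = value      # each written level holds a backup
```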
In the embodiment of the invention, data is cached hierarchically using multi-level caches: only the first-type data is cached in the first-level cache, which has the highest access speed, while all types of data are stored in the remaining levels of caches. On the one hand, caching the first-type data in the first-level cache ensures that it is cached quickly and read with high real-time performance and high availability; on the other hand, because all types of data are cached in the remaining levels of caches, the data is stored completely, which reduces the risk of data loss and increases the security and reliability of the data. In addition, when the data volume increases, the remaining levels of caches can be expanded to meet the demand of rapidly growing data caching, which costs less than expanding the first-level cache.
Optionally, among the N levels of caches, the access speeds of the caches decrease level by level; that is, the access speed of the first-level cache is the highest, and the access speeds of the subsequent levels of caches decrease in sequence.
Furthermore, among the N levels of caches, the cache capacities increase level by level; that is, the cache capacity of the first-level cache is the smallest, and the cache capacities of the subsequent levels of caches increase in sequence.
Optionally, N is equal to 3, and the N levels of caches include a first-level cache, a second-level cache, and a third-level cache;
the access speeds of the first-level cache, the second-level cache, and the third-level cache decrease in sequence.
Further, the cache capacities of the first-level cache, the second-level cache, and the third-level cache increase in sequence.
This embodiment provides a three-level cache architecture in which the first-type data can be cached in all three levels of caches simultaneously and the second-type data can be cached in two levels of caches simultaneously, balancing caching cost against data completeness.
Optionally, the method further includes:
deleting first data from an i-th-level cache when the first data exists in the i-th-level cache;
where i is an integer greater than or equal to 1 and less than N, and the first data is data whose read frequency within a preset period is lower than a preset frequency.
When i is equal to 1: since the first-level cache has the fastest access speed, a correspondingly smaller capacity, and a correspondingly higher expansion cost, low-activity data cached there can be deleted from the first-level cache to improve its utilization and reduce the caching cost. Because the remaining levels of caches also hold that data, it remains cached in a subsequent level even after being deleted from the first-level cache, so the data can still be stored safely and reliably.
When i is not equal to 1: although the i-th-level cache has a large storage capacity and is convenient to expand, the full data may contain a large amount of inactive long-tail data, and it is unnecessary to cache all of it in the i-th-level cache. In view of this, the inactive long-tail data (i.e., the first data) can be deleted from the i-th-level cache, releasing as much of its capacity as possible, improving its utilization, and reducing the caching cost. Even after the data is deleted from the i-th-level cache, it remains cached in a subsequent level, so the data can still be stored safely and reliably.
Therefore, among the N levels of caches, the N-th-level cache serves as a permanent cache for the full data, providing a comprehensive guarantee for data backup and ensuring that all data can be stored safely and reliably. Accordingly, the N-th-level cache may be the cache with the largest capacity and the lowest expansion cost, so as to meet the caching demand of data growth to the greatest extent.
Taking the Redis + Couchbase + HiKV three-level cache architecture shown in fig. 1 as an example, a Redis cluster serves as the first-level cache. Couchbase's stability and real-time performance meet the requirements, its cache capacity is greatly improved over Redis, and its subsequent expansion is more convenient, so Couchbase is selected as the second-level cache below the Redis cluster. Although Couchbase can be expanded conveniently, given the large amount of inactive long-tail data, it is unnecessary to cache all data in Couchbase; therefore, a low-cost HiKV distributed cache is further introduced below Couchbase. When inactive data exists in Couchbase, it is deleted from Couchbase in time, while remaining permanently cached in HiKV.
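A sketch of this deletion step, under the assumption that per-key read counts over the preset period are tracked externally (all bookkeeping names are illustrative):

```python
import time

def evict_inactive(levels, read_counts, period_start, preset_frequency):
    """Delete low-activity entries from levels 1..N-1; level N is permanent.

    read_counts maps key -> reads observed since period_start;
    preset_frequency is the threshold in reads per second.
    """
    elapsed = max(time.time() - period_start, 1e-9)
    for level in levels[:-1]:          # never delete from the N-th level
        for key in list(level.keys()):
            frequency = read_counts.get(key, 0) / elapsed
            if frequency < preset_frequency:
                del level[key]         # still cached in a lower level
```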
Optionally, the method further includes:
receiving a data reading instruction;
reading target data from the first-level cache, where the target data is the data that the data reading instruction requires to be read;
and if the target data is not read, reading the target data from the next-level cache, and so on, until the target data is read.
To better understand the data reading process, in the embodiment of the present invention a video playing client (e.g., an online video client) reads video data. In this step, the client may receive a data reading instruction input by a user and call a cache accessor (Cache access) Application Programming Interface (API) to obtain the video data that the data reading instruction requires. Specifically, the client may invoke the cache accessor to read the required data (i.e., the target data) from the first-level cache; if the target data is not found in the first-level cache, the read penetrates to the next-level cache, and so on, until the target data is read or all levels of caches have been tried.
In the embodiment of the invention, data is cached hierarchically in multi-level caches and read by penetrating the levels one by one in ascending order of level, which improves the data reading speed and ensures real-time performance during data reading.
Among the data cached in the N levels of caches, data that has been read has higher activity than data that has not. In view of this, to give read data an advantage in reading speed, it may be dynamically loaded according to the data reading situation, so that read data is cached in a cache with a higher access speed. When the data is loaded, it also remains cached in its original cache, so the data is cached in multiple caches, which improves its security. Specific embodiments of data loading are described below.
Scheme I: if the target data is read from the k-th-level cache, load the target data into the (k-1)-th-level cache, where k is an integer greater than 2 and less than or equal to N.
Scheme II: if the target data is read from the j-th-level cache and the target data is first-type data that meets a preset condition, load the target data into the first-level cache, where j is an integer greater than 1 and less than or equal to N.
In Scheme I, considering that the levels of caches other than the first-level cache have large storage capacities and low expansion costs, when the target data is read from the k-th-level cache it can be loaded into the (k-1)-th-level cache unconditionally, regardless of whether it is first-type or second-type data.
In Scheme II, considering that the first-level cache has the smallest storage capacity and a high expansion cost, when first-type data is read from the j-th-level cache, whether to load it into the first-level cache may be determined by whether it satisfies the preset condition. Here, the preset condition may be understood as a loading condition, and it may include at least one of the following: the frequency at which the data is read in the j-th-level cache is higher than a preset frequency; the number of times the data is read in the j-th-level cache exceeds a predetermined number. Further, whether to load the first-type data into the first-level cache may be determined according to an LRU algorithm.
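Schemes I and II can be sketched together, assuming 1-indexed levels and an externally supplied check for the preset condition (for example, an LRU admission decision); all names here are illustrative:

```python
def load_after_read(levels, key, value, k, is_first_type, meets_condition):
    """Load data upward after a hit at level k (1-indexed).

    Scheme I: a hit at level k > 2 is loaded unconditionally into level k-1.
    Scheme II: first-type data hit at level k > 1 is loaded into level 1
    only when the preset (loading) condition is met.
    """
    if k > 2:
        levels[k - 2][key] = value     # level k-1 is index k-2 (0-indexed)
    if k > 1 and is_first_type and meets_condition:
        levels[0][key] = value         # promote into the first-level cache
```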
Taking the Redis + Couchbase + HiKV three-level cache architecture shown in fig. 1 as an example, the data reading and data loading processes are exemplified below.
As shown in fig. 1 and fig. 3, the Cache access serves as the client's cache reading module. When receiving a data reading instruction, the client may call the Cache access to read the corresponding data level by level in the order Redis, Couchbase, HiKV. Specifically, the Cache access first tries to read the corresponding data from Redis; if it is not found, the read penetrates to the Couchbase layer. If the corresponding data is read at the Couchbase layer and its type is PGC, whether the data needs to be loaded into the Redis layer above can be decided according to the corresponding LRU algorithm. If the corresponding data is still not found at the Couchbase layer, the read continues to penetrate down into the permanently stored HiKV. If the data is read in HiKV, it may be loaded into the second-level Couchbase with a two-month expiration period. If the read data type is PGC, whether to load the data into the Redis layer above may again be considered according to the corresponding LRU algorithm.
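Putting the pieces together for the Redis + Couchbase + HiKV example, the Cache access read path might be sketched as follows (dict-like tiers again stand in for the real SDKs; `lru_admits` and the expiry constant are assumptions):

```python
TWO_MONTHS = 60 * 24 * 3600  # ~two months (60 days) in seconds, illustrative

def cache_access_read(redis_tier, couchbase_tier, hikv_tier, key, is_pgc,
                      lru_admits):
    """Read `key` level by level: Redis, then Couchbase, then HiKV.

    lru_admits(key) decides, per the LRU algorithm, whether a PGC item
    should be loaded into the Redis layer above.
    """
    value = redis_tier.get(key)
    if value is not None:
        return value
    value = couchbase_tier.get(key)
    if value is None:
        value = hikv_tier.get(key)     # penetrate to permanent storage
        if value is None:
            return None                # not cached at any level
        # Load back into Couchbase; a real client would set ttl=TWO_MONTHS.
        couchbase_tier[key] = value
    if is_pgc and lru_admits(key):
        redis_tier[key] = value        # consider promoting PGC data to Redis
    return value
```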
Combining the above embodiments, the embodiments of the present invention have the following beneficial effects: first, the first-type data can be cached quickly and read with high real-time performance and high availability; second, all types of data can be stored completely, reducing the risk of data loss and increasing the security and reliability of the data; third, read data can be dynamically loaded so that it is cached in a cache with a higher access speed; and fourth, when the data volume increases, the remaining levels of caches can be expanded to meet the demand of rapidly growing data caching, which costs less than expanding the first-level cache.
As shown in fig. 4, an embodiment of the invention provides a data caching apparatus 500, where the data caching apparatus 500 has N levels of caches, N being an integer greater than 1; the data caching apparatus 500 includes:
an obtaining module 501, configured to acquire data;
a caching module 502, configured to cache the data in each level of the N levels of caches respectively if the data is first-type data, and to cache the data in each level of cache other than the first-level cache respectively if the data is second-type data;
where the access speed of the first-level cache is higher than that of the remaining levels of caches.
Optionally, as shown in fig. 5, the data caching apparatus 500 further includes:
a deleting module 503, configured to delete first data from an i-th-level cache when the first data exists in the i-th-level cache;
where i is an integer greater than or equal to 1 and less than N, and the first data is data whose read frequency within a preset period is lower than a preset frequency.
Optionally, as shown in fig. 6, the data caching apparatus 500 further includes:
a receiving module 504, configured to receive a data reading instruction;
a reading module 505, configured to read target data from the first-level cache, where the target data is the data that the data reading instruction requires to be read, and, if the target data is not read, to read the target data from the next-level cache until the target data is read.
Optionally, as shown in fig. 7, the data caching apparatus 500 further includes:
a first loading module 506, configured to load the target data into the (k-1)-th-level cache if the target data is read from the k-th-level cache, where k is an integer greater than 2 and less than or equal to N.
Optionally, as shown in fig. 8, the data caching apparatus 500 further includes:
a second loading module 507, configured to load the target data into the first-level cache if the target data is read from the j-th-level cache and the target data is first-type data that meets a preset condition, where j is an integer greater than 1 and less than or equal to N.
Optionally, the preset condition includes at least one of:
the frequency at which the target data is read in the j-th-level cache is higher than a preset frequency;
the number of times the target data is read in the j-th-level cache exceeds a predetermined number.
Optionally, N is equal to 3, and the N levels of caches include the first-level cache, the second-level cache, and the third-level cache;
the access speeds of the first-level cache, the second-level cache, and the third-level cache decrease in sequence.
It should be noted that any implementation manner in the data caching method embodiments may be implemented by the data caching apparatus 500 in this embodiment with the same beneficial effects; to avoid repetition, details are not described here again.
As shown in fig. 9, an embodiment of the present invention further provides an electronic device 800, where the electronic device 800 includes a memory 801, a processor 802, and a computer program stored in the memory 801 and executable on the processor 802; the processor 802 may be communicatively coupled to a data caching apparatus having N levels of caches, where N is an integer greater than 1; when the processor 802 executes the computer program, the following steps are implemented:
acquiring data;
if the data is the first type data, caching the data in each level of cache of the N-level cache respectively;
if the data is the second type of data, caching the data in each level of cache except the first level of cache in the N-level cache respectively;
the access speed of the first-level buffer is higher than that of the rest of the buffers in each level.
In fig. 9, the bus architecture may include any number of interconnected buses and bridges, linking together one or more processors, represented by the processor 802, and various memory circuits, represented by the memory 801. The bus architecture may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore not described further herein. The bus interface provides an interface. The processor 802 is responsible for managing the bus architecture and general processing, and the memory 801 may store data used by the processor 802 when executing instructions. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted mobile terminal, a wearable device, and the like.
Optionally, when the processor 802 executes the computer program, the following is further implemented:
deleting first data from an i-th-level cache when the first data exists in the i-th-level cache;
where i is an integer greater than or equal to 1 and less than N, and the first data is data whose read frequency within a preset period is lower than a preset frequency.
Optionally, when the processor 802 executes the computer program, the following is further implemented:
receiving a data reading instruction;
reading target data from the first-level cache, where the target data is the data that the data reading instruction requires to be read;
and if the target data is not read, reading the target data from the next-level cache until the target data is read.
Optionally, when the processor 802 executes the computer program, the following is further implemented:
if the target data is read from the k-th-level cache, loading the target data into the (k-1)-th-level cache, where k is an integer greater than 2 and less than or equal to N.
Optionally, when the processor 802 executes the computer program, the following is further implemented:
if the target data is read from the j-th-level cache and the target data is first-type data that meets a preset condition, loading the target data into the first-level cache, where j is an integer greater than 1 and less than or equal to N.
Optionally, the preset condition includes at least one of:
the frequency at which the target data is read in the j-th-level cache is higher than a preset frequency;
the number of times the target data is read in the j-th-level cache exceeds a predetermined number.
Optionally, N is equal to 3, and the N levels of caches include the first-level cache, the second-level cache, and the third-level cache;
the access speeds of the first-level cache, the second-level cache, and the third-level cache decrease in sequence.
It should be noted that any implementation manner in the data caching method embodiment may be implemented by theelectronic device 800 in this embodiment, and the same beneficial effects are achieved, and details are not described here.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements each process of the data caching method embodiments and can achieve the same technical effects; to avoid repetition, details are not described here again. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the units is only one type of logical function division, and other division manners may be available in actual implementation, for example, a plurality of units or components may be combined or integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be physically included alone, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute some of the steps of the methods according to the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.