SYSTEMS AND METHODS FOR CACHE REPLACEMENT
BACKGROUND

The present invention, in some embodiments thereof, relates to cache management and, more specifically, but not exclusively, to cache replacement policies.
Hardware and software applications may use a cache to improve performance. Page caching, for example, is a cache of disk pages kept by the operating system, stored in unused main memory. In another example, web caching is a temporary storage of web documents to increase performance. In cloud computing and cloud networking, an open virtual switch (OVS) uses a caching mechanism at the OS kernel level (the data path) to process incoming traffic faster compared to the user level. In cloud networking, where compute nodes are massively deployed, inefficient local caching may cause bottlenecks in the network. The implemented caching policy therefore becomes crucial. Every cache has a limited capacity and therefore requires replacement of entries when the capacity of the cache is exceeded; otherwise, the processing performance benefit provided by the cache is degraded.
The performance of a caching policy is measured by the miss rate, defined as the rate at which referenced entries are not found in the cache, relative to the total number of references. The goal of a cache replacement policy is to improve performance by minimizing the miss rate.
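For concreteness, the miss rate over an observation period may be written as follows (a standard formulation; the notation is illustrative and not taken from the claims):

$$\text{miss rate} = \frac{N_{miss}}{N_{ref}}$$

where N_miss denotes the number of references for which the entry is not found in the cache, and N_ref denotes the total number of references over the same period.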
Existing cache replacement policies, for example, least recently used (LRU), are chosen primarily for implementation simplicity. As computation power becomes stronger and more readily available, more complex algorithms may be applied with less performance compromise.
The following is a list of the most common cache replacement policy methods:
* Belady's Algorithm evicts from the cache the entries which will be referenced in the farthest future. The algorithm is theoretically optimal, but not practically feasible, since a perfect prediction of the incoming input is required.
* First In First Out (FIFO) and Last In First Out (LIFO) are methods that are simple to implement. However, their performance is relatively poor in comparison with other existing cache replacement policies.
* Other exemplary, more advanced and complex cache replacement policies include: Least-Recently Used (LRU), Most-Recently Used (MRU), Pseudo-LRU (PLRU), Low Inter-reference Recency Set (LIRS), Random Replacement (RR), Segmented LRU (SLRU), Least-Frequently Used (LFU), LFU with Dynamic Aging (LFUDA), Not Frequently Used (NFU), Adaptive Replacement Cache (ARC), and Clock with Adaptive Replacement (CAR). It is noted that LRU is widely used; however, the method underperforms more advanced and complex algorithms. The OVS, for instance, uses LRU caching.
SUMMARY
It is an object of the present invention to provide an apparatus, a method, a computer program product, and a system for managing a cache storing cached entries.
The foregoing and other objects are achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
According to a first aspect, an apparatus for managing a cache storing cached entries comprises: a processor configured to: receive snapshots of the cached entries, compute for each cached entry of the cached entries based on the snapshot of the respective cached entry, the following values: a usage score indicative of the average number of references of the respective cached entry per unit of time, and a span score indicative of the effective lifetime of the respective cached entry, and designate a subset of the cached entries for replacement according to the computed values of the usage score and span score.
According to a second aspect, a method for managing a cache storing cached entries, comprises: receiving snapshots of the cached entries, computing for each cached entry of the cached entries based on the snapshot of the respective cached entry, the following values: a usage score indicative of the average number of references of the respective cached entry per unit of time, and a span score indicative of the effective lifetime of the respective cached entry, and designating a subset of the cached entries for replacement according to the computed values of the usage score and span score.
The apparatus and/or method provide an improvement in performance of a cache replacement policy in comparison to existing cache replacement policies (e.g., least recently used (LRU)), by reducing the cache miss rate and/or increasing the cache hit rate. LRU-based replacement (e.g., implemented by OVS) does not minimize the miss rate of incoming packets (and/or other items), especially in high rate, high entropy scenarios. The usage score is used to replace cache entries with low usage while expecting new arbitrary cache entries, which increases the probability that higher usage entries will be stored in the cache. The span score is used to replace cache entries that have already been hit, but are unlikely to be hit further. Long-lived cache entries are prioritized for remaining in the cache.
The evaluation of the cache entries over a period of time using multiple aggregated snapshots, and the designation of the cache entries for replacement, occur in parallel and/or independently of cache activity (i.e., reading and writing the cache entries), which improves performance of the cache by reducing or preventing interference with the cache activity. The parallel and/or simultaneous evaluation of the cache entries is in contrast to other existing cache replacement policies that perform cache replacement together with (i.e., directly dependent on) evaluation of the cache entries.
The cache entries are designated for replacement according to a combination based on time (i.e., span score) and on usage (i.e., usage score), which improves performance over replacement according to only time based factors or only usage based factors.
The apparatus and/or method are agnostic to the caching domain, and may be applied to different utilities, applications, and use cases that are implemented using a cache.
In a first possible implementation of the apparatus according to the first aspect or the method according to the second aspect, the processor is further configured to or the method further comprises: computing for each cached entry of the cached entries based on the snapshot of the respective cached entry, a dirtiness score indicative of whether the respective cached entry is being processed, and wherein the designation of the subset is further performed according to the computed value of the dirtiness score. The dirtiness score handles prescaling. Latency is reduced by postponing replacement of cache entries currently being processed to a future time when the cache entry is not being processed.
In a second possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the processor is further configured to or the method further comprises sorting the cached entries according to at least the computed values of the usage score and the span score, and designating the subset of the cached entries according to the sorting based on a replacement score requirement. The scoring is designed for flexibility, and is not based on a specific underlying use, application, case, or scenario. The sorting of the scores may be performed by increasing value or decreasing value, depending on the implementation.
In a third possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the snapshots of the cached entries are obtained at a first predefined rate, and aggregation is performed using the snapshots obtained during a predefined sliding window.
The snapshots may be taken at a predefined rate that is independent of other events (e.g., reads/writes to the cache), which improves computation of the scores in real time.

In a fourth possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the size of the subset of the cached items designated for replacement is according to a predefined score intensity that denotes a maximum percentage of cache entries designated for replacement. The replacements are performed according to the maximum percentage of cache entries that may be replaced, which improves performance of the cache in comparison to other methods that decide whether to replace each cache entry independently of other entries.
In a fifth possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the subset of the cached items designated for replacement are evicted from the cache at a second predefined rate.
The evictions may be performed at a predefined rate that is independent of other events (e.g., reads/writes to the cache), which improves the real time eviction of cache entries.
In a sixth possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the processor is further configured to or the method further comprises computing a score vector based on at least the computed values of the usage score, and span score, and wherein the subset of the cached entries is designated for replacement according to the computed score vector.
The score vector, which is based on the usage score and the span score, may improve computations, for example, in comparison to independently considering each of the usage score and span score.
In a seventh possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, each cache entry stores an indication of an item with references.
The apparatus improves the performance of a network using the cache entries to store flows of packets (and/or other items).
In an eighth possible implementation form of the apparatus or the method according to the seventh implementation form of the first or second aspect, at least some of the items with references include mice items of high rate and high entropy.
The apparatus improves performance of the cache for caching mice flows, in comparison to existing cache replacement policies that create bottlenecks, resulting in performance degradation. For example, incoming packets (and/or other items) may not be processed in adequate time, increasing the latency and/or the loss rate of the packets (and/or other items).
In a ninth possible implementation form of the apparatus or the method according to the seventh or eighth implementation forms of the first or second aspect, the value of the usage score is computed as the average number of references per item associated with each cache entry, and the value of the span score is computed as the effective lifetime of the item associated with each cache entry.
The average number of items (e.g., packets) per flow and/or the lifetime of the flow are used to differentiate between mice flows and elephant flows. Mice flows, which are short-lived flows with a small number of items (e.g., packets), lead to inefficiency of the cache, and are therefore designated for replacement.
In a tenth possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, each cache entry stores an indication of a switching flow item of an open virtual switch (OVS) implemented in at least one of a computing cloud and a cloud networking environment.
The apparatus improves performance of a cache used by an OVS deployed in cloud network computing hosts (e.g., in a cloud computing and/or cloud networking environment). The performance of each OVS in the network is critical to ensure the quality of service (QoS) of the network. The improvement in performance of the OVS cache by the apparatus is in comparison to existing OVS cache replacement policies that become bottlenecks, resulting in performance degradation. For example, in cases of port scans, P2P rendezvous servers, and/or network monitoring applications, the OVS cache management is not necessarily sufficient to treat all the incoming packets in adequate time, increasing the latency and/or the loss rate of the packets.
The performance of the OVS is improved by improving the performance of the cache, for example, in comparison to other attempts to improve the performance of the OVS by other approaches. For example, data plane development kit (DPDK), code optimizations, megaflows, priority sorting, staged lookup, and longest prefix match are examples of existing methods that were implemented in order to improve the OVS performance along different versions.
In an eleventh possible implementation form of the apparatus according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the apparatus further comprising a cache monitoring device that captures snapshots of each of the cached entries, or the apparatus further comprises capturing snapshots of each of the cached entries, and wherein the processor is further configured to or the method further comprises performing aggregation of the snapshots.
The cache monitoring device obtains snapshots of the cached entries and performs the aggregation without significantly affecting performance of the cache.

Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
Some embodiments of the invention are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced. In the drawings:
FIG. 1 is a flowchart of a method that designates cache entries for eviction based on a usage score and a span score, in accordance with some embodiments of the present invention;
FIG. 2 is a block diagram of components of a system that includes a computing device that manages cache entries stored in a cache used by a computing system, by designating cache entries for eviction based on a usage score and a span score, in accordance with some embodiments of the present invention;
FIG. 3 is a graph depicting computation of the usage score, in accordance with some embodiments of the present invention;
FIG. 4 is a graph depicting computation of the span score, in accordance with some embodiments of the present invention;
FIG. 5 is a block diagram depicting components of a system based on the method described with reference to FIG. 1, and/or implemented by the system described with reference to FIG. 2, in accordance with some embodiments of the present invention;
FIG. 6 is a dataflow diagram depicting dataflow within a system based on the method described with reference to FIG. 1, and/or implemented by the system described with reference to FIG. 2, in accordance with some embodiments of the present invention;
FIG. 7 is a dataflow diagram depicting the process of designating cached entries for eviction according to the usage score and the span score, in accordance with some embodiments of the present invention;
FIG. 8 is a schematic of a set-up of a test environment used to experimentally measure performance of a cache using the apparatus and/or systems and/or methods described herein, in accordance with some embodiments of the present invention;
FIG. 9 is a table summarizing the experimental results for the experiment performed using the set-up depicted in FIG. 8, in accordance with some embodiments of the present invention;
FIG. 10 is a graph depicting miss rate as a function of the number of targets per virtual machine for the experiment performed using the set-up depicted in FIG. 8, in accordance with some embodiments of the present invention; and
FIG. 11 is a graph depicting latency as a function of the number of targets per virtual machine for the experiment performed using the set-up depicted in FIG. 8, in accordance with some embodiments of the present invention.
DETAILED DESCRIPTION
The present invention, in some embodiments thereof, relates to cache management and, more specifically, but not exclusively, to cache replacement policies.

An aspect of some embodiments of the present invention relates to an apparatus and/or a method (e.g., code instructions stored in a data storage device executed by one or more processors) that designate a subset of cached entries stored in a cache for replacement according to values of a usage score and a span score computed for each cached entry. The usage score is indicative of the average number of references of the respective cached entry per unit of time. The span score is indicative of the effective lifetime of the respective cached entry.
Optionally, the designation of the subset of cached entries for replacement is performed according to a dirtiness score indicative of whether the respective cached entry is being processed.
The apparatus and/or method provide an improvement in performance of a cache replacement policy in comparison to existing cache replacement policies (e.g., least recently used (LRU)), by reducing the cache miss rate and/or increasing the cache hit rate. LRU-based replacement (e.g., implemented by OVS) does not minimize the miss rate of incoming packets (and/or other items), especially in high rate, high entropy scenarios. The usage score is used to replace cache entries with low usage while expecting new arbitrary cache entries, which increases the probability that higher usage entries will be stored in the cache. The span score is used to replace cache entries that have already been hit, but are unlikely to be hit further. Long-lived cache entries are prioritized for remaining in the cache.
The evaluation of the cache entries over a period of time using multiple aggregated snapshots, and the designation of the cache entries for replacement, occur in parallel and/or independently of cache activity (i.e., reading and writing the cache entries), which improves performance of the cache by reducing or preventing interference with the cache activity. The parallel and/or simultaneous evaluation of the cache entries is in contrast to other existing cache replacement policies that perform cache replacement together with (i.e., directly dependent on) evaluation of the cache entries.
The cache entries are designated for replacement according to a combination based on time (i.e., span score) and on usage (i.e., usage score), which improves performance over replacement according to only time based factors or only usage based factors.
The apparatus and/or method are agnostic to the caching domain, and may be applied to different utilities, applications, and use cases that are implemented using a cache.

Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention. Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
As used herein, the terms eviction, replacement, and deletion are used interchangeably with reference to cache entries that are deleted, evicted, and/or replaced with new cache entries.
Reference is now made to FIG. 1, which is a flowchart of a method that designates cache entries for eviction based on a usage score and a span score, in accordance with some embodiments of the present invention. Blocks 102-108 refer to the process of computing the usage score and span score. Blocks 110-114 refer to the process of designating cache entries for eviction based on the computed usage score and span score. Reference is also made to FIG. 2, which is a block diagram of components of a system 200 that includes a computing device 202 that manages cache entries stored in a cache 204 used by a computing system 206, by designating cache entries for eviction based on a usage score and a span score, in accordance with some embodiments of the present invention.
Computing system 206 may be implemented as, for example, one or more of: a computing cloud, a cloud network, a computer network, a virtual machine(s) (e.g., hypervisor, virtual server), a switch, an OVS, a virtual network, a router, a virtual router, a single computing device (e.g., client terminal), a group of computing devices arranged in parallel, a network server, a web server, a storage server, a local server, a remote server, a client terminal, a mobile device, a stationary device, a kiosk, a smartphone, a laptop, a tablet computer, a wearable computing device, a glasses computing device, a watch computing device, and a desktop computer.
Cache 204 is used by computing system 206 to improve performance by storing cache entries. Cache 204 may be integrated within computing system 206, and/or may be implemented as an external storage device. Cache 204 may be implemented as, for example, a random access memory (RAM), read-only memory (ROM), and/or a storage device, for example, non-volatile memory, magnetic media, semiconductor memory devices, hard drive, removable storage, and optical media (e.g., DVD, CD-ROM).
Cache 204 may be implemented as, for example, a single level cache, a multi-level cache, and/or one or more levels of a multi-level cache (e.g., a last level cache).
Computing device 202 may be implemented as, for example, integrated within cache 204 (e.g., as hardware and/or software installed within cache 204), integrated within computing system 206 (e.g., as hardware and/or software installed within computing system 206), and/or as an external component (e.g., implemented as hardware and/or software) in communication with cache 204.
Computing device 202 includes one or more processor(s) 208, implemented as for example, central processing unit(s) (CPU), graphics processing unit(s) (GPU), field programmable gate array(s) (FPGA), digital signal processor(s) (DSP), application specific integrated circuit(s) (ASIC), customized circuit(s), processors for interfacing with other units, and/or specialized hardware accelerators. Processor(s) 208 may be implemented as a single processor, a multi-core processor, and/or a cluster of processors arranged for parallel processing (which may include homogenous and/or heterogeneous processor architectures). It is noted that processor(s) 208 may be designed to implement in hardware one or more features stored as code instructions (e.g., described herein as stored in memory 210 and/or data storage 212).
Memory 210 stores code instructions implementable by processor(s) 208, for example, a random access memory (RAM), read-only memory (ROM), and/or a storage device, for example, non-volatile memory, magnetic media, semiconductor memory devices, hard drive, removable storage, and optical media (e.g., DVD, CD-ROM). Memory 210 may store code instructions that when executed by processor(s) 208, implement one or more acts of the method described with reference to FIG. 1.
Computing device 202 may include a data storage device 212 for storing data. Data storage device 212 may be implemented as, for example, a memory, a local hard-drive, a removable storage unit, an optical disk, a storage device, and/or as a remote server and/or computing cloud (e.g., accessed using a network connection). It is noted that code instructions executable by processor(s) 208 may be stored in data storage device 212, for example, with executing portions loaded into memory 210 for execution by processor(s) 208. Data storage device 212 may include one or more data repositories, for example, snapshots of cache entries obtained from cache 204.
Optionally, a cache monitoring device 214 monitors cache entries stored in cache 204. Cache monitoring device 214 acquires snapshots of the cache entries, for example, as memory dumps, and provides the snapshots to computing device 202. Cache monitoring device 214 may perform aggregation of the snapshots (as described herein), and/or provide the snapshots for aggregation by a different component (e.g., computing device 202). Cache monitoring device 214 may be implemented as software instructions, and/or as a hardware component, which may be an independent component, integrated within cache 204, and/or integrated within computing device 202, and/or integrated within computing system 206.

Optionally, a cache interface 216 designates cache entries stored in cache 204 for eviction (selected by computing device 202 as described herein), for example, by marking the designated cache entries, such as by setting one or more bits of the cache indicative that the cache entry may be overwritten. Alternatively or additionally, cache interface 216 actively deletes the designated cache entries from cache 204. Cache interface 216 may be implemented as software instructions, and/or as a hardware component, which may be an independent component, integrated within cache 204, and/or integrated within computing device 202, and/or integrated within computing system 206.
Computing device 202 may be in communication with a user interface 218 that presents data to a user and/or includes a mechanism for entry of data, for example, one or more of: a touch-screen, a display, a keyboard, a mouse, voice activated software, and a microphone. User interface 218 may be used to manually or automatically configure, for example, the association between the computed span score and usage score, and the cache entries designated for eviction. As used herein, the term automatically configure means an external optimizer and/or machine learning program that recommends a configuration based on historical data, either dynamically or at each setup, subject to the specific environment in which the code executes.

At 102, snapshots of the cached entries stored in cache 204 are received by computing device 202 for processing by processor(s) 208.
The snapshots of the cached entries may be captured by cache monitoring device 214. The cache monitoring device obtains snapshots of the cached entries and performs the aggregation without significantly affecting performance of the cache. The snapshots of the cached entries are obtained at a predefined rate. The snapshots may be taken at a predefined rate that is independent of other events (e.g., reads/writes to the cache), which improves computation of the scores in real time.
Optionally, the snapshots are aggregated. The aggregation of the snapshots of the cached entries may be performed by processor(s) 208, by cache monitoring device 214, and/or by another computing device. The aggregation is performed using the snapshots obtained during a predefined sliding window. The aggregation may include processing of cache related data, grouped by flow id (ufid), into a data structure used to compute the usage score and/or span score and/or dirtiness score, each by its corresponding definition, together with the common flow attributes, for example, the src/dst IP, src/dst MAC, src/dst port, action, packets, bytes, usage score, span score, and dirtiness score during the predefined sliding window.
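As a minimal sketch of this aggregation step, assuming each snapshot is a mapping from flow id (ufid) to a per-snapshot reference count, and omitting the flow attributes listed above:

```python
from collections import defaultdict

def aggregate_snapshots(snapshots, window_size):
    """Aggregate per-entry reference counts over the last window_size
    snapshots (the predefined sliding window).

    snapshots: list of dicts, each mapping ufid -> reference count
               observed in that snapshot (hypothetical layout).
    Returns: dict mapping ufid -> list of per-snapshot reference counts,
             from which the usage, span, and dirtiness scores are computed.
    """
    window = snapshots[-window_size:]
    aggregated = defaultdict(list)
    for snapshot in window:
        for ufid, references in snapshot.items():
            aggregated[ufid].append(references)
    return dict(aggregated)
```

In a full implementation, each aggregated record would also carry the common flow attributes (src/dst IP, src/dst MAC, src/dst port, action, packets, bytes) mentioned above.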
Each cache entry stores an indication of an item with references. Exemplary items with references include a flow of packets through a network. The apparatus and/or method improve the performance of the network using the cache entries to store flows of packets. At least some of the items with references may include mice items of high rate and high entropy, for example, flows of packets defined as mice flows of high rate and high entropy traffic. The apparatus and/or method improve performance of the cache for caching mice flows, in comparison to existing cache replacement policies that create bottlenecks, resulting in performance degradation. For example, incoming packets (and/or other items) may not be processed in adequate time, increasing the latency and/or the loss rate of the packets (and/or other items).

In an exemplary implementation, each cache entry stores an indication of a switching flow item of an open virtual switch (OVS) 206 implemented in a computing cloud and/or a cloud networking environment. The apparatus and/or method improve performance of a cache used by an OVS deployed in cloud network computing hosts (e.g., in a cloud computing and/or cloud networking environment). The performance of each OVS in the network is critical to ensure the quality of service (QoS) of the network. The improvement in performance of the OVS cache by the apparatus is in comparison to existing OVS cache replacement policies that become bottlenecks, resulting in performance degradation. For example, in cases of port scans, P2P rendezvous servers, and/or network monitoring applications, the OVS cache management is not necessarily sufficient to treat all the incoming packets in adequate time, increasing the latency and/or the loss rate of the packets.
The performance of the OVS is improved by improving the performance of the cache, for example, in comparison to other attempts to improve the performance of the OVS by other approaches. For example, data plane development kit (DPDK), code optimizations, megaflows, priority sorting, staged lookup, and longest prefix match are examples of existing methods that were implemented in order to improve the OVS performance along different versions.
At 104, a usage score indicative of the average number of references of the respective cached entry per unit of time is computed. Each reference denotes a read access to the cached entry. The unit of time may correspond to the caching duration of the flow within the sliding window used to perform the aggregation of the snapshots.
The value of the usage score may be computed as the average number of references per item associated with each cache entry, for example, the average number of packets per flow. The value of the span score may be computed as the effective lifetime of the item associated with each cache entry, for example, the effective lifetime of the flow. The average number of references (e.g., packets) per flow and/or the lifetime of the flow are used to differentiate between mice flows and elephant flows. Mice flows, which are short-lived flows with a small number of references (e.g., packets), lead to inefficiency of the cache, and are therefore designated for replacement.
The scoring described herein is designed for flexibility, and is not necessarily based on a specific underlying use, application, case, or scenario.
The usage score is mathematically represented by the following equation:

$$S_{usg} = \frac{1}{T} \int_0^T R_i(t)\, dt$$

where:
S_usg denotes the usage score,
T denotes the time in which the item was observed in the cache, and
R_i(t) denotes the references to item i at time t (e.g., the number of packets observed in a snapshot).
Reference is now made to FIG. 3, which is a graph depicting computation of the usage score based on the above mathematical equation, in accordance with some embodiments of the present invention. X-axis 302 denotes time t. Y-axis 304 denotes a number of references. Curve 306 denotes current references as a function of time, Ri(t). Curve 308 denotes the accumulated number of references. T denotes the time range for which the usage score is computed.
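A minimal sketch of a discrete approximation of the usage score, assuming the per-snapshot reference counts produced by the aggregation above and a fixed time β between snapshots (so that T equals the number of snapshots times β):

```python
def usage_score(ref_counts, beta):
    """Discrete approximation of S_usg = (1/T) * integral of R_i(t) dt,
    using the rectangle rule over the per-snapshot reference counts.

    ref_counts: per-snapshot reference counts R_i for one cache entry
    beta: time between snapshots; T = len(ref_counts) * beta
    """
    if not ref_counts:
        return 0.0
    observed_time = len(ref_counts) * beta   # T
    integral = sum(ref_counts) * beta        # rectangle-rule approximation
    return integral / observed_time          # average references per unit time
```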
Referring now back to FIG. 1, at 106, a span score indicative of the effective lifetime of the respective cached entry is computed. The lifetime of the cached entry may be defined as the period from the time at which a new item (e.g., flow) is cached to the time at which the item is replaced. The effective lifetime may be computed over multiple snapshots within the sliding window. The effective lifetime may be computed during the sliding window, for example, to identify cached entries that have a lifetime shorter than time T.
The span score is mathematically represented by the following: if there exists t such that R_i(t') = 0 for all t < t' ≤ T, then

$$S_{spn} = t$$

(taking the smallest such t); otherwise

$$S_{spn} = 0.$$
Reference is now made to FIG. 4, which is a graph depicting computation of the span score based on the above mathematical relationship, in accordance with some embodiments of the present invention. X-axis 402 denotes time t. Y-axis 404 denotes a number of references. Curve 406 denotes current references as a function of time. The effective lifetime of the cached entry is represented by portion 408. Note that S_spn 410 is defined as the time at which curve 406 reaches X-axis 402, after which it remains zero up to time T. Curve 412 denotes the accumulated number of references. Thus, S_spn is the smallest time at which the accumulated number of references reaches its final value (i.e., stops increasing).
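A minimal sketch of the span score under the same assumptions (per-snapshot reference counts taken at interval β); the entry's effective lifetime is the last time at which it was still referenced, provided it then remained idle up to time T:

```python
def span_score(ref_counts, beta):
    """Effective lifetime of an entry within the observation window.

    Returns the time after which the entry received no further references
    up to T = len(ref_counts) * beta, or 0 if the entry was still being
    referenced in the last snapshot (no effective end of life observed).
    """
    if not ref_counts or ref_counts[-1] != 0:
        return 0.0
    last_active = 0
    for k, refs in enumerate(ref_counts, start=1):
        if refs != 0:
            last_active = k
    return last_active * beta
```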
Referring now back to 106 of FIG. 1, optionally, a score vector is computed based on at least the computed values of the usage score, and span score. The score vector may include the computed dirtiness score, as described with reference to 108.
The score vector may include weights assigned to each of the usage score and/or span score and/or dirtiness score. The score vector may be processed in two dimensions (when including the usage score and span score), or three dimensions (when further including the dirtiness score). The score vector may be evaluated to arrive at a single dimension, for example, a numerical value computed based on a function applied to the usage score and span score.
The subset of the cached entries is designated for replacement according to the computed score vector, as described with reference to 112. The score vector, which is based on the usage score and the span score, may improve computations, for example, in comparison to independently considering each of the usage score and span score.
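One possible reduction of the score vector to a single dimension is a weighted sum; the weights and the linear form below are illustrative assumptions, since the text only requires some function applied to the scores:

```python
def combined_score(usage, span, dirtiness=0.0, weights=(1.0, 1.0, 1.0)):
    """Reduce the (usage, span, dirtiness) score vector to a scalar.

    weights: hypothetical per-score weights; tuning them trades off
             usage-based against time-based replacement decisions.
    """
    w_usg, w_spn, w_drt = weights
    return w_usg * usage + w_spn * span + w_drt * dirtiness
```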
At 108, a dirtiness score indicative of whether the respective cached entry is being processed is computed for the cached entries based on the snapshot of the respective cached entry. The dirtiness score handles prescaling. Latency is reduced by postponing replacement of cache entries currently being processed to a future time when the cache entry is not being processed.
The dirtiness score may be computed as follows: if the cached entry is being processed, then the dirtiness score is based on (e.g., equal to) the average time it takes to process the respective cached entry minus the time since processing started, otherwise the value of the dirtiness score is zero.
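A minimal sketch of the dirtiness score per the rule above; the clamp to zero for entries whose processing has overrun the average is an added assumption:

```python
def dirtiness_score(being_processed, avg_processing_time, elapsed):
    """Expected remaining processing time of a cached entry.

    being_processed: whether the entry is currently being processed
    avg_processing_time: average time it takes to process this entry
    elapsed: time since processing of the entry started
    """
    if not being_processed:
        return 0.0
    return max(avg_processing_time - elapsed, 0.0)  # clamp is an assumption
```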
At 110, the cached entries may in turn be sorted according to each score S_x of the score vector S, and a subset determined by the score intensity I_x is designated. The dirtiness score may be included in the score vector. The sorting may be performed per computed score of the score vector. The sorting of the scores may be performed by increasing value or decreasing value, depending on the implementation. For example, relatively higher scores may be indicative of cache entries that are actively used and have long life spans, and therefore should not be replaced. Lower scores may be indicative of cache entries that are rarely used and have short life spans, and therefore should be replaced.
At 112, a union of the designated cached entries may be replaced according to the computed values of the usage score and span score, and optionally the dirtiness score. The designation may be performed according to the score vector.
The designation of the subset of the cached entries is performed according to the sorted cached entries based on a replacement score requirement. The replacement score requirement may define a threshold, where cached entries having score values above (or below) the replacement score requirement are replaced, and cached entries having score values below (or above) the replacement score requirement are retained.
In another example, the replacement score requirement may include a predefined number or percentage of cached entries that are to be replaced, for example, 10000 entries, or 10% of all entries.
The size of the subset of the cached items designated for replacement may be defined according to a predefined score intensity that denotes a maximum percentage of cache entries designated for replacement, for example, no more than 10% of all cached entries. The predefined score intensity may include a predefined maximum number of cached entries designated for replacement, for example, no more than 10000 cached entries may be replaced. For example, the top 10000 or the top 10% of cached entries (based on the sorting) having the highest (or lowest) scores are designated.
Designating cached entries for replacement according to the maximum percentage (or predefined number) of cache entries that may be replaced improves performance of the cache in comparison to other methods that decide whether to replace each cache entry independently of other entries.
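A minimal sketch of designation for a single score type, assuming lower scores indicate weaker candidates to keep and the intensity is given as a fraction of all entries:

```python
def designate_for_replacement(scores, intensity):
    """Designate up to a fixed fraction of entries for replacement.

    scores: dict mapping ufid -> score value for one score type
    intensity: maximum fraction of entries to designate (e.g., 0.1 for 10%)
    Returns: set of ufids designated for replacement.
    """
    ranked = sorted(scores, key=scores.get)   # ascending: weakest entries first
    cutoff = int(len(ranked) * intensity)     # predefined score intensity
    return set(ranked[:cutoff])
```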
At 114, the union of the cached items designated for replacement is evicted from the cache. The eviction may be performed at a predefined rate that is independent of other events (e.g., reads/writes to the cache), which improves the real time eviction of cache entries. At 116, blocks 102-108 are iterated, dynamically updating the score vector of each entry according to the score definitions described herein.
At 118, blocks 110-114 are iterated, dynamically replacing cached entries according to the scores described herein, which improves performance of the cache as described herein. The acts of the method described with reference to FIG. 1 may be mathematically represented as follows:
* Input: C_w, β, η, {I_s1, I_s2, ..., I_sn}, where:
C_w denotes the number of instances from which aggregation is performed to obtain the scores,
β denotes the time between snapshots,
η denotes the time to replacement,
S denotes the score vector per cached or non-cached entry, and
I_x denotes the replacement intensity for score type x, optionally the percentage of entries for replacement.
* Score computation phase (occurs every β seconds): aggregate snapshots of cached entries over C_w events, and compute S for each cached entry.
* Replacement phase (occurs every η seconds): for each score S_x in S, sort items by score S_x, designate items for replacement according to I_x, and evict the union of designated items.

Reference is now made to FIG. 5, which is a block diagram depicting components of a system 500 based on the method described with reference to FIG. 1, and/or implemented by system 200 described with reference to FIG. 2, in accordance with some embodiments of the present invention. A cache monitoring device 514 (corresponding to cache monitoring device 214 described with reference to FIG. 2) transmits CacheStats_t 550 (i.e., snapshots of cached entries) obtained from a cache (e.g., cache 204 described with reference to FIG. 2).
CacheStats_t 550 are provided to computing device 502 (corresponding to computing device 202 described with reference to FIG. 2) for computation of the usage score and span score used for designation of the cache entries for eviction, as described herein. Computing device 502 includes the following modules, which may be implemented as code instructions stored in a memory executed by a processor(s) of computing device 502, and/or implemented as hardware:
* Aggregation 552 that aggregates the CacheStats_t (i.e., snapshots) of the cache entries, as described herein.
* Scoring 554 calculates the scores of each cache entry (i.e., usage, span, and/or dirtiness), as described herein.
* Prioritizing 556 sorts the cache entries according to the computed scores, as described herein.
* Selection 558 designates (e.g., marks) cache entries, up to the predetermined score intensity, as candidates for replacement according to the sorting.
* Replacement 560 evicts the designated cache entries from the cache.
Reference is now made to FIG. 6, which is a dataflow diagram depicting dataflow within a system 600 based on the method described with reference to FIG. 1, and/or implemented by system 200 described with reference to FIG. 2, in accordance with some embodiments of the present invention. A cache monitoring system 614 (corresponding to cache monitoring device 214 described with reference to FIG. 2) transmits CacheStats_t 650 (i.e., snapshots) obtained from a cache (e.g., cache 204 described with reference to FIG. 2). CacheStats_t 650 are provided to computing device 602 (corresponding to computing device 202 described with reference to FIG. 2) for designation of the cache entries for eviction, as described herein.
At 652, a collection of flow dump results (denoted as {f_t1, f_t2, ..., f_tm}) obtained at times {t_1, t_2, ..., t_m} is received from monitoring system 614, where f_t is the flow dump result at time t. Snapshots of the cache entries are taken from the cache at a constant rate.
At 654, an aggregation of the monitored flow dump results is performed every β seconds. At 656, the aggregated flow (denoted F) is provided. Each aggregated item F is characterized by the computed scores.
At 658, every η seconds, monitoring system 614 obtains a snapshot of the cache.
At 660, the cached entries are selected. At 662, for each computed score (i.e., usage, span, and/or dirtiness) the cache entries undergo block 664 and block 666 as described next.
At 664, the cached entries are sorted according to the current score (ascending or descending, depending on the score sentiment).
At 666, cached entries are marked (i.e., designated) for replacement according to a ratio I_x denoting the score intensity of the maximum percentage (or number) of cached entries that may be designated for replacement.
At 668, the marked cache entries are evicted from the cache. The cache entries may be removed from the cache by deleting the union of the marked flows.
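Putting the pieces of FIG. 6 together, a sketch of one replacement phase (blocks 660 through 668), reusing the hypothetical helper defined above for each score type and evicting the union of the marked entries:

```python
def replacement_phase(per_score_values, per_score_intensity, evict):
    """Run one replacement phase over all score types.

    per_score_values: dict mapping score name -> {ufid: score value};
                      scores are assumed pre-oriented so that ascending
                      order ranks the weakest entries first.
    per_score_intensity: dict mapping score name -> intensity fraction I_x
    evict: callback that removes the given set of ufids from the cache
    """
    marked = set()
    for name, scores in per_score_values.items():
        intensity = per_score_intensity.get(name, 0.0)
        marked |= designate_for_replacement(scores, intensity)
    evict(marked)  # delete the union of the marked flows
    return marked
```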
Reference is now made to FIG. 7, which is a dataflow diagram depicting cached entries designated for eviction according to the usage score and the span score, based on the method described with reference to FIG. 1, and/or implemented by system 200 described with reference to FIG. 2, in accordance with some embodiments of the present invention. A score intensity denoting the maximum percentage of cached entries designated for replacement is defined for the usage score and for the span score. For example, when cached entries store indications of multiple flows of packets, a separate intensity is defined for the usage score and for the span score.
Block 702 denotes designation of cached entries for replacement according to the score intensity defined for the usage score. Block 704 denotes sorting of the cached entries according to the span score, and block 706 denotes designation of the cached entries for replacement according to the score intensity defined for the span score. It is noted that sorting may be performed based on any one of the elements in the score vector, including the dirtiness score, where the score intensity is according to the intensity vector, including the dirtiness intensity.

Various implementations and aspects of the systems and/or methods delineated hereinabove and as claimed in the claims section below find experimental support in the following examples.
EXAMPLES
Reference is now made to the following examples, which together with the above descriptions illustrate some implementations of the systems and/or methods described herein in a non-limiting fashion.
Reference is now made to FIG. 8, which is a schematic of a set-up of a test environment used to experimentally measure performance of a cache using the apparatus and/or systems and/or methods described herein, in accordance with some embodiments of the present invention. The testing was performed on a DevStack (DragonFlow, a distributed software defined networking (SDN) controller supporting distributed switching and routing) - single host. The cache being evaluated was used by OVS 2.5. The cache flow size limit was set to 1000 (the default is 200K). The network topology included a total of 50 virtual machines (VMs) in 3 networks.
A full mesh stress test was performed using NetPerf TCP CRR (open multiple TCP connections), sending a random number of bytes (1-32768 per connection). NetPerf is a benchmark that is used to measure the performance of different types of networks.
Reference is now made to FIG. 9, which is a table summarizing the experimental results for 30 targets per virtual machine, for the experiment performed using the systems and/or methods described herein, using the set-up depicted in FIG. 8, in accordance with some embodiments of the present invention. The experimental results show that, when the cache used by an OVS is flooded due to high rate and high entropy traffic, the miss rate of incoming packets that enter the data path looking for matching cache flows is reduced relative to the existing LRU cache replacement policy.
The table is interpreted using the following:
* The smaller the usage score, the more mouse-like the flow.
* The span score denotes the effective lifetime of the flow.
* The best results, and results within 2% of the best, are shown in bold.
* The worst results, and results within 2% of the worst, are shown in italic.
The following structural conclusions for the specific deployment setup as depicted in FIG. 8 are reached based on the table:
* To optimize Recall, use a usage score intensity of 60%, and consider a span score intensity of 60%.
* To optimize Throughput, use a usage score intensity of 40-60%.
* To optimize Latency, use a usage score intensity of 40-60%, and consider using a span score intensity of 40%.
* To optimize Jitter, use a usage score intensity of 40%.
* To obtain overall improved results relative to LRU, a best practice is to use a usage score intensity of 40-60%.
Reference is now made to FIG. 10, which is a graph depicting miss rate as a function of the number of targets per virtual machine for the experiment performed using the systems and/or methods described herein, using the set-up depicted in FIG. 8, in accordance with some embodiments of the present invention. The input parameters for the algorithm are encoded to a string as follows:
Three graphs are shown in FIG. 10: "baseline", "algo 0.5_60_60_0_10000_2_", and "algo minima". The legend "baseline" refers to a run without eviction; that is, 0% intensity for each of the scores. The legend "algo 0.5_60_60_0_10000_2_" refers to a run with η = 0.5, I_usg = 60%, I_spn = 60%, I_lru = 0%, C_w = 10000, and β = 2. The legend "algo minima" refers to a collection of all setups that each yielded the minimum miss rate, given the number of targets per VM. Each point in the "algo minima" curve is a result of a single setup of the input parameters for the algorithm, and is thus shown along the curve as "Params". In addition to the "Params" notation, the "Distance" specifies the distance of the curve that would be drawn from that setup along the x-axis for a variable number of targets per VM.
Reference is now made to FIG. 11, which is a graph depicting latency as a function of the number of targets per virtual machine for the experiment performed using the systems and/or methods described herein, using the set-up depicted in FIG. 8, in accordance with some embodiments of the present invention. Based on the graph, the following conclusions may be reached: for 25-35 VMs per host, use a usage score intensity of 40%; for over 35 VMs per host, use an LRU score intensity of 40%.
Other systems, methods, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
It is expected that during the life of a patent maturing from this application many relevant caches will be developed and the scope of the term cache is intended to include all such new technologies a priori.
As used herein the term "about" refers to ± 10 %.
The terms "comprises", "comprising", "includes", "including", "having" and their conjugates mean "including but not limited to". This term encompasses the terms "consisting of and "consisting essentially of.
The phrase "consisting essentially of means that the composition or method may include additional ingredients and/or steps, but only if the additional ingredients and/or steps do not materially alter the basic and novel characteristics of the claimed composition or method.
As used herein, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a compound" or "at least one compound" may include a plurality of compounds, including mixtures thereof. The word "exemplary" is used herein to mean "serving as an example, instance or illustration". Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments. The word "optionally" is used herein to mean "is provided in some embodiments and not provided in other embodiments". Any particular embodiment of the invention may include a plurality of "optional" features unless such features conflict.
Throughout this application, various embodiments of this invention may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases "ranging/ranges between" a first indicate number and a second indicate number and "ranging/ranges from" a first indicate number "to" a second indicate number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements. All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting.