US11841797B2

Info

Publication number: US11841797B2
Application number: US17/684,450
Authority: US
Inventors: Shirish Vijayvargiya
Original assignee: VMware LLC
Current assignee: VMware LLC
Priority date: 2022-01-13
Filing date: 2022-03-02
Publication date: 2023-12-12
Also published as: US20230251967A1

Abstract

The disclosure provides an approach for content based read cache (CBRC) digest file creation. Embodiments include determining a mapping between entries in a CBRC and physical block addresses (PBAs) associated with a source virtual machine (VM). Embodiments include creating a clone VM based on the source VM. Embodiments include, for each data block associated with the clone VM: determining a PBA associated with a logical block address (LBA) of the data block, determining, based on the mapping, whether data associated with the PBA is cached in the CBRC, and, if the data associated with the PBA is cached in the CBRC, copying a hash of the data from a first digest file of the source VM to a second digest file of the clone VM and associating the hash with the LBA in the second digest file.

Description

RELATED APPLICATION

Benefit is claimed under 35 U.S.C. 119(a)-(d) to Foreign Application Serial No. 202241002047 filed in India entitled “OPTIMIZING INSTANT CLONES THROUGH CONTENT BASED READ CACHE”, on Jan. 13, 2022, by VMware, Inc., which is herein incorporated in its entirety by reference for all purposes.

BACKGROUND

Virtualized computing systems provide efficiency and flexibility for system operators by enabling computing resources to be deployed and managed as needed to accommodate specific applications and capacity requirements. As virtualized computing systems mature and achieve broad market acceptance, demand continues for increased performance of virtual endpoints and increased overall system efficiency.

A virtualized computing system involves multiple hosts in communication over a physical network infrastructure. Each host has one or more virtualized endpoints such as virtual machines (VMs), containers, or other virtual computing instances (VCIs). The virtualized endpoints can be connected to logical overlay networks. A logical overlay network may span multiple hosts and is decoupled from the underlying physical network infrastructure.

Hosts are configured to provide a virtualization layer, also referred to as a hypervisor. The virtualization layer abstracts processor, memory, storage, and networking resources into multiple virtual endpoints that run concurrently on the same host. Each VM may be configured to store and retrieve file system data within a corresponding storage system. Relatively slow access latencies associated with storage, such as hard disk drives implementing the storage system, give rise to a bottleneck in file system performance, reducing overall system performance.

A clone VM is a copy of another VM. In some embodiments, a clone VM refers to an instant clone. Cloning creates a VM from the running state of another VM resulting in a destination VM (i.e., the clone VM) that is identical to the source VM (i.e., the parent VM). For example, the clone VM has a processor state, virtual device state, memory state, and disk state identical to the source VM from which it is cloned at the instant it is cloned. However, because clone VMs typically share an underlying physical storage system with a parent VM (e.g., in some cases sharing some or all of the same underlying physical storage blocks), the creation of clone VMs can lead to performance degradation as the number of storage input/output (I/O) requests directed to the underlying physical storage system increases.

In some cases, to improve storage I/O performance, a cache may be stored in physical memory (e.g., random access memory (RAM)) configured within a host. The cache acts as a small, fast memory that stores recently accessed data items and can be used to satisfy data requests without accessing the storage. Accordingly, data requests satisfied by the cache are executed with less latency as the latency associated with accessing the storage is avoided. In some cases, a content based read cache (CBRC) may be used. A CBRC caches data such that the key used to retrieve data stored in the CBRC is based on a function of the data itself, and not a block address associated with the data. In particular, a hash of the data is stored as a key used to retrieve the actual data associated with the hash. Therefore, regardless of the block address indicated in an I/O request, such as a read I/O, the read I/O can be serviced from the CBRC, instead of storage, if the data associated with the block address is in the CBRC. For example, particular data may be the same for multiple block addresses, and therefore any read I/O that references any such block address may be serviced from CBRC when it stores the particular data.
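By way of a non-limiting illustration, the content-based keying described above may be sketched in Python as follows. The class name `ContentCache` and its methods are hypothetical and are not taken from any actual CBRC implementation; SHA-1 is assumed as the hash function for the purposes of the sketch.

```python
import hashlib


class ContentCache:
    """Hypothetical content-keyed cache: the key is a hash of the data itself."""

    def __init__(self):
        self._by_hash = {}  # hash (hex string) -> cached block contents

    def put(self, data: bytes) -> str:
        # The key depends only on the content, not on any block address.
        key = hashlib.sha1(data).hexdigest()
        self._by_hash[key] = data  # identical content is stored only once
        return key

    def get(self, key: str):
        return self._by_hash.get(key)

    def __len__(self):
        return len(self._by_hash)


cache = ContentCache()
block = b"\x00" * 4096
key_a = cache.put(block)        # data read for one block address
key_b = cache.put(bytes(4096))  # same content read for a different block address
```

In this sketch both block addresses resolve to the same key, so a read for either address can be served from the single cached copy.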

To enable CBRC use, a virtual disk of a VM may be associated with a digest file, which is a cryptographical representation of the virtual disk that stores metadata about each data block of the virtual disk. In particular, for each such data block, a corresponding unique hash of the data (also referred to as content) of the data block is generated, for example by using a cryptographic hashing algorithm such as the secure hashing algorithm (SHA)-1 algorithm, and the hash is stored in the digest file. The digest file maintains a mapping of each block address (e.g., LBA) of each data block in the virtual disk to a corresponding hash. Thus, if a read I/O is received for an LBA associated with the VM, the digest file may be used to determine the hash that corresponds to the LBA, and the hash may be used to access cached data in the CBRC.

However, when a clone VM is created from a source VM according to current techniques, there is not yet a digest file for the one or more virtual disks of the clone VM (e.g., since the virtual disks of the clone VM have different LBAs than corresponding virtual disks of the source VM). Generating a digest file can be a resource-intensive and time-consuming process, as it involves computing cryptographic hashes of data in each data block of a virtual disk, thereby requiring a read I/O and a hash computation for each data block. Creating digest files for clone VMs may add an excessive amount of load on physical computing resources, particularly when a large number of clone VMs are created. Furthermore, creating a digest file for a clone VM created through an instant clone process may detract from the benefits of an instant clone as the digest file creation process will cause poor performance or non-functionality of the clone VM for a time.

As such, there is a need in the art for improved techniques of enabling use of a CBRC for clone VMs.

SUMMARY

Embodiments provide a method for content based read cache (CBRC) digest file creation. Embodiments include determining a mapping between entries in a CBRC and physical block addresses (PBAs) associated with a source virtual machine (VM); creating a clone VM based on the source VM; and for each data block associated with the clone VM: determining a PBA associated with a logical block address (LBA) of the data block; determining, based on the mapping, whether data associated with the PBA is cached in the CBRC; and if the data associated with the PBA is cached in the CBRC, copying a hash of the data from a first digest file of the source VM to a second digest file of the clone VM and associating the hash with the LBA in the second digest file.

Further embodiments include a non-transitory computer-readable storage medium storing instructions that, when executed by a computer system, cause the computer system to perform the method set forth above, and a computer system programmed to carry out the method set forth above.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a host computing system in accordance with embodiments of the present disclosure.

FIG. 2 is an illustration of an example related to content based read cache (CBRC) digest file generation for clone virtual machines (VMs).

FIG. 3 is an illustration of an example related to utilizing a content based read cache (CBRC) for clone virtual machines (VMs) with a digest file created as described herein.

FIG. 4 depicts example operations related to content based read cache (CBRC) digest file generation for clone virtual machines (VMs).

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized on other embodiments without specific recitation.

DETAILED DESCRIPTION

The present disclosure provides an approach for enabling use of a content based read cache (CBRC) for clone virtual machines (VMs). In particular, a digest file is created for a clone VM by using mappings between logical block addresses (LBAs) and physical block addresses (PBAs) to identify and copy relevant information from a digest file for a source VM's virtual disk to the digest file for the clone VM's virtual disk. Thus, when creating a digest file for a clone VM's virtual disk according to embodiments of the present disclosure, hash computations only need to be performed for data blocks associated with the clone VM's virtual disk that were not already included in the digest file for the source VM's virtual disk.

By utilizing LBA-to-PBA mappings to determine which information can be copied from a digest file associated with a source VM to a digest file for a clone VM, techniques described herein significantly reduce processing resource utilization and storage I/O load required to generate a digest file for a clone VM. Accordingly, because embodiments of the present disclosure allow a digest file to be created for a clone VM in a fast and resource-efficient manner, digest files may be created as described herein for clone VMs created using an instant clone process without inhibiting or delaying the performance of the clone VMs. Creating digest files for clone VMs as described herein allows a CBRC to be utilized for the clone VMs, thereby reducing load on the underlying storage system and improving performance of the computing devices involved.

FIG. 1 depicts a host computing system 100 in accordance with embodiments of the present disclosure. Host computing system 100 is representative of a virtualized computer architecture. As shown, host computing system 100 includes a host 102 and storage 116.

Host 102 may be constructed on a server grade hardware platform 106, such as an x86 architecture platform. Host 102 is configured to provide a virtualization layer, also referred to as a hypervisor 104, that abstracts processor, memory, storage, and networking resources of hardware platform 106 into multiple virtual machines (VMs) 103₁ to 103ₙ (collectively referred to as VMs 103 and individually referred to as VM 103) that run concurrently on the same host 102. Though certain techniques herein are discussed with respect to VMs, they may similarly be applicable to other suitable virtual computing instances (VCIs), such as containers, virtual appliances, and/or the like. In some embodiments, VMs 103₂ to 103ₙ are clone VMs created (e.g., using an instant clone process) from VM 103₁, which may be a source VM. For example, VMs 103₂ to 103ₙ may have been created for a virtual desktop infrastructure (VDI) environment. In a VDI environment, a local client device can access and display a remote virtual or physical desktop or remote application that is running on a remote device. For instance, a virtual desktop (e.g., representing a desktop of a VM) may be hosted on a central infrastructure known as a VDI, and may be rendered on a client device using a remote display protocol.

Storage 116 provides VMs access to consolidated, block-level data storage. In one embodiment, storage 116 is a virtual storage area network (vSAN) that aggregates local or direct-attached capacity devices of a host cluster and creates a single storage pool shared across all hosts in the cluster. In another embodiment, storage 116 is storage directly coupled to host 102. In another embodiment, storage 116 includes local storage 114 in hardware platform 106.

Storage 116 manages storage of data at a block granularity. For example, storage 116 is divided into a number of physical blocks (e.g., 4096 bytes or “4K” size blocks), each physical block having a corresponding physical block address (PBA) that indexes the physical block in storage. The physical blocks of storage 116 are used to store blocks of data (also referred to as data blocks) used by VMs 103, which may be referenced by logical block addresses (LBAs), as discussed herein. Each block of data may have an uncompressed size corresponding to a physical block. Blocks of data may be stored as compressed data or uncompressed data in storage 116, such that there may or may not be a one to one correspondence between a physical block on storage 116 and a data block referenced by a logical block address.

Storage 116 receives I/O requests for a data block from a VM, which the VM refers to using a guest LBA that is in an address space used by the VM to address blocks of data. Such an LBA may be referred to as an LBA of the data block. Different VMs may use the same LBA to refer to different data blocks, as the LBA is specific to the VM.

Storage 116 stores the data block in a physical block. The physical block where the data block is stored is referred to as a physical block of the data block. The physical block of the data block is addressed by a PBA corresponding to the physical block. One or more mapping tables such as page table(s) 112 may be used to map the relationship between an LBA and its corresponding PBA, as is known in the art.

One or more virtual disks backed by virtual disk files 140 stored in storage 116 each has a separate associated digest file 138 that is created and stored in storage 116. Techniques for creation of digest files for clone VMs are described herein. A digest file 138 is a cryptographical representation of the virtual disk and stores metadata about each data block of the virtual disk 140. In particular, for each such data block, a corresponding unique hash of the data (also referred to as content) of the data block is generated, for example by using a cryptographic hashing algorithm such as the SHA-1 algorithm, and the hash is stored in the digest file. The digest file 138 maintains a mapping of each block address (e.g., LBA) of each data block in the virtual disk to a corresponding hash. For example, the digest file 138 stores tuples of <LBA, hash of data referenced by the LBA, validity bit>, where the LBA is the key. The validity bit indicates whether the particular LBA is “valid” or “invalid.” An LBA is valid if there is actual data stored in virtual disk file 140/storage 116 that is addressed by the LBA. An LBA is invalid if there is no data stored in virtual disk file 140/storage 116 that is addressed by the LBA (e.g., due to deletion, initialization, etc.). According to embodiments of the present disclosure, a digest file 138 may also include PBA to CBRC mapping 114. For example, a digest file may be extended to include a PBA for each LBA to hash mapping, thereby mapping PBAs to CBRC entries (which are indexed by hashes). In alternative embodiments, PBA to CBRC mapping 114 is stored separately from the digest file 138.
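As a non-limiting sketch, the <LBA, hash, validity bit> tuples and the optional PBA extension described above might be modeled in Python as follows. The names `DigestEntry` and `DigestFile`, the in-memory dictionary layout, and the example addresses are assumptions made for illustration and do not reflect an actual on-disk digest file format.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DigestEntry:
    lba: int                    # key: logical block address
    data_hash: Optional[str]    # hash of the data referenced by the LBA
    valid: bool                 # validity bit
    pba: Optional[int] = None   # extension: PBA recorded alongside the hash


class DigestFile:
    """Hypothetical in-memory model of a digest file, keyed by LBA."""

    def __init__(self):
        self.entries: Dict[int, DigestEntry] = {}

    def record(self, lba: int, data: bytes, pba: Optional[int] = None) -> None:
        # SHA-1 is used here because it is the example algorithm named above.
        data_hash = hashlib.sha1(data).hexdigest()
        self.entries[lba] = DigestEntry(lba, data_hash, True, pba)

    def invalidate(self, lba: int) -> None:
        # No data is addressed by this LBA (e.g., after deletion).
        self.entries[lba] = DigestEntry(lba, None, False)

    def hash_for(self, lba: int) -> Optional[str]:
        entry = self.entries.get(lba)
        return entry.data_hash if entry is not None and entry.valid else None


digest = DigestFile()
digest.record(lba=7, data=b"hello", pba=4096)
digest.invalidate(lba=8)
```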

Hypervisor 104 runs in conjunction with an OS (not shown) in host 102. Hypervisor 104 can be installed as system level software directly on hardware platform 106 of host 102 (often referred to as “bare metal” installation) and be conceptually interposed between the physical hardware and the guest OSs executing in the VMs 103. One or more storage management modules may also run on host 102, and may handle processing of read and write I/Os, such as using techniques described herein.

Hardware platform 106 of host 102 includes physical resources of a computing device, such as a memory 108 and local storage 114. Hardware platform 106 may include other physical resources of a computing device, not shown in FIG. 1, such as one or multiple processor(s), accelerator(s), a disk interface, and/or a network interface. As discussed, physical resources of host 102 can be abstracted into a number of VMs 103 by hypervisor 104, such that the physical resources of the host 102 may be shared by VMs 103 residing on host 102.

Local storage 114 may include one or more hard disks, flash memory modules, solid state disks, and/or optical disks. Memory 108 may include, for example, one or more RAM modules. Memory 108 contains a CBRC 110, such as in one or more reserved memory spaces. Memory 108 also contains in-memory copies of page table(s) 112 and digest file(s) 138 (including PBA to CBRC entry mapping 114), described herein. For example, when a VM 103 is powered on, the page table(s) 112 and digest file(s) 138 associated with the virtual disk(s) 140 of the VM 103 may be loaded into memory 108 such that there is an in-memory copy of the page table(s) 112 and digest file(s) 138 accessible, thereby avoiding I/O to storage 116.

CBRC 110 is generally a cache for data (e.g., corresponding to contents of data blocks of virtual disks backed by virtual disk files) accessed by VMs 103. CBRC 110 may be implemented as a virtual small computer system interface (vSCSI) that maintains a global cache on host 102 to serve read I/O requests for VMs 103. Data stored in CBRC 110 is not tied to any particular VM 103 and may be shared across VMs 103 on host 102. Thus, this implementation allows for the detection of duplicate content of data blocks across VMs 103 running on host 102, as well as the servicing of I/Os across VMs 103 running on host 102 from CBRC 110.

As shown, CBRC 110 includes least recently used (LRU) cache 128, which stores data for CBRC 110 that are indexed by CBRC lookup table 126. LRU is a cache eviction strategy, wherein if the cache size has reached the maximum allocated capacity, the least recently accessed object(s) in the cache will be evicted from the cache. Data in CBRC LRU cache 128 are maintained in LRU order. It is noted that LRU is included as an example, and other eviction strategies may be used. CBRC 110 manages the LRU cache 128 by monitoring capacity of the LRU cache 128 when accommodating new data corresponding to blocks of data in CBRC LRU cache 128. When capacity is reached, CBRC 110 evicts least recently accessed data first.
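The LRU eviction strategy described above may be illustrated, in a non-limiting manner, with the following Python sketch built on the standard library `collections.OrderedDict`. The class name `LRUCache`, the capacity of two entries, and the keys `h1` to `h3` are hypothetical values chosen only for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical LRU cache: entries are kept in least-to-most recently used order."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        # When capacity is exceeded, evict the least recently accessed entry first.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def keys(self):
        return list(self._items)


lru = LRUCache(capacity=2)
lru.put("h1", b"A")
lru.put("h2", b"B")
lru.get("h1")        # h1 becomes the most recently used entry
lru.put("h3", b"C")  # capacity exceeded, so h2 is evicted
```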

CBRC 110 further includes CBRC lookup table 126. CBRC lookup table 126 indexes the data stored in LRU cache 128 by hashes of data. For example, CBRC lookup table 126 stores tuples of <hash of data, memory location of data>, where the hash is the key. Thus, given a hash of data, the CBRC lookup table 126 can be used to determine if the data is stored in LRU cache 128, and if so, the location of the data (e.g., in memory 108) such that it can be retrieved from LRU cache 128.

For example, if a read I/O indicating an LBA of a virtual disk file 140 is received from a VM 103, the digest file 138 (e.g., in-memory copy) of the virtual disk file 140 can be used to retrieve a hash of the data of a data block stored in storage 116 as associated with the LBA. The hash of the data is used to search the CBRC lookup table 126 to determine if there is a matching hash, such that the read I/O can be serviced from the CBRC 110, instead of issuing the read I/O to the virtual disk file 140 in storage 116. It should be noted that hashes of data stored in digest file 138, CBRC lookup table 126, and other data structures discussed herein are consistent such that for a given data, the same hash is stored in each data structure. For example, the hashes are generated using a same hashing algorithm for each data structure. In some embodiments, the data stored in CBRC 110 is deduplicated, meaning that a given data is stored only once in CBRC 110, regardless of the number of LBAs for which I/Os are received that are associated with the given data. This helps to reduce the amount of memory needed to implement CBRC 110.

As described in more detail below with respect to FIG. 2, PBA to CBRC entry mapping 114 may be created (e.g., added to a digest file 138) for a virtual disk 140 corresponding to a source VM (e.g., VM 103₁) based on a page table 112 for the virtual disk 140. The PBA to CBRC entry mapping 114 may then be used to identify and copy relevant information (e.g., hashes) from the digest file 138 for the virtual disk 140 to a new digest file 138 for a virtual disk 140 of a clone VM (e.g., VM 103₂). Thus, digest files may be efficiently created for clone VMs without requiring redundant computation of hashes for data blocks that have already been cached for the source VM.

Once a digest file 138 has been created for a clone VM, the digest file 138 can then be used to respond to read requests from the clone VM via CBRC 110, as described in more detail below with respect to FIG. 3.

FIG. 2 is an illustration 200 of an example related to content based read cache (CBRC) digest file generation for clone virtual machines (VMs). Illustration 200 includes virtual disks 140₁ and 140₂, page tables 112₁ and 112₂, and digest files 138₁ and 138₂, which correspond to virtual disk file(s) 140, page table(s) 112, and digest file(s) 138 of FIG. 1, described above.

Digest file 138₁ corresponds to a source VM, such as VM 103₁ of FIG. 1, from which a clone VM was created, such as using an instant clone process. According to existing techniques, digest file 138₁ includes a mapping of LBAs 212 to hashes 214. Certain techniques described herein involve extending digest file 138₁ to include PBAs 216 in association with LBAs 212 and hashes 214.

In an example, LBAs 208 of a source VM virtual disk 140₁ (e.g., the LBAs of all data blocks in the virtual disk) are determined. A page table 112₁ of the source VM is then used to determine the PBAs 210 that correspond to LBAs 208 based on LBA-to-PBA mappings that are included in page table 112₁. A mapping between PBAs 210 and CBRC entries (e.g., indexed by hashes 214) is then created, such as by adding a corresponding PBA 210 to each entry in digest file 138₁, thereby resulting in mappings between LBAs 212, hashes 214, and PBAs 216 in digest file 138₁. In alternative embodiments, a mapping between PBAs 216 and hashes 214 is stored separately from digest file 138₁. Furthermore, while not shown, each LBA 212 and/or PBA 216 may be associated (e.g., in digest file 138₁) with an identifier of a storage device to which the LBA 212 and/or PBA 216 corresponds, such as a physical storage device identifier and/or logical volume management (LVM) identifier.
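By way of a non-limiting illustration, extending a source digest with PBAs taken from a page table may be sketched in Python as follows. The function name `extend_digest_with_pbas`, the use of plain dictionaries for the digest file and page table, and the example addresses and hash strings are assumptions made for the sketch only.

```python
from typing import Dict, Tuple


def extend_digest_with_pbas(
    digest: Dict[int, str], page_table: Dict[int, int]
) -> Tuple[Dict[int, Tuple[str, int]], Dict[int, str]]:
    """Return (LBA -> (hash, PBA), PBA -> hash) for a source virtual disk.

    digest:     hypothetical LBA -> hash mapping (existing digest contents)
    page_table: hypothetical LBA -> PBA mapping
    """
    extended = {}
    pba_to_hash = {}
    for lba, data_hash in digest.items():
        pba = page_table.get(lba)
        if pba is None:
            continue  # no LBA-to-PBA mapping is known for this LBA
        extended[lba] = (data_hash, pba)
        pba_to_hash[pba] = data_hash  # PBA now maps to a hash-indexed cache entry
    return extended, pba_to_hash


source_digest = {0: "hash-a", 1: "hash-b"}
source_page_table = {0: 700, 1: 701}
extended, pba_to_hash = extend_digest_with_pbas(source_digest, source_page_table)
```

The second returned dictionary corresponds to the alternative in which the PBA-to-hash mapping is kept separately from the digest file.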

After a clone VM is created, such as VM 103₂ of FIG. 1, a digest file 138₂ for a virtual disk 140₂ of the clone VM is created according to techniques described herein. In an example, LBAs 202 of the clone VM virtual disk 140₂ (e.g., the LBAs of all data blocks in the virtual disk) are determined. A page table 112₂ of the clone VM is then used to determine the PBAs 204 that correspond to LBAs 202 based on LBA-to-PBA mappings that are included in page table 112₂. PBAs 204 are then used to identify information (e.g., hashes 206) that can be copied from digest file 138₁ to digest file 138₂.

In some embodiments, for each PBA 204, digest file 138₁ is searched for the PBA 204 and, if the PBA 204 is included in digest file 138₁ (e.g., in PBAs 216), then the hash 214 that corresponds to the PBA 204 in digest file 138₁ is copied to digest file 138₂ (e.g., and stored in hashes 224). If the PBA 204 is not included in digest file 138₁, then the data corresponding to the PBA 204 is retrieved from storage and a hash of the data is computed and stored in digest file 138₂ (e.g., in hashes 224). The data may also be cached in the CBRC at this time (e.g., since the data is read from storage in order to compute the hash, it may be stored in the CBRC along with the hash).

Thus, hashes 206 represent hashes 214 that correspond to PBAs 204 and are therefore copied from digest file 138₁ to digest file 138₂. Digest file 138₂, once created according to techniques described herein, comprises mappings of LBAs 222 to hashes 224 and PBAs 226. For example, each hash 206 that is copied from digest file 138₁ and/or each hash that is computed (e.g., as needed when not already present in digest file 138₁) is stored in digest file 138₂ in association with the corresponding LBA 202 and PBA 204 used to retrieve and/or compute the hash.
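The copy-or-compute behavior described above may be sketched, in a non-limiting manner, in Python as follows. The function name `build_clone_digest`, the dictionary-based page table and PBA-to-hash mapping, the `read_block` callback, and the sample block contents are all hypothetical; SHA-1 is assumed for hashes that must be computed.

```python
import hashlib
from typing import Callable, Dict, Tuple


def build_clone_digest(
    clone_page_table: Dict[int, int],
    source_pba_to_hash: Dict[int, str],
    read_block: Callable[[int], bytes],
) -> Tuple[Dict[int, Tuple[str, int]], int]:
    """Return (clone LBA -> (hash, PBA), number of hashes that had to be computed)."""
    clone_digest = {}
    hashes_computed = 0
    for lba, pba in clone_page_table.items():
        data_hash = source_pba_to_hash.get(pba)
        if data_hash is None:
            # PBA not present in the source digest: read the data and hash it.
            data_hash = hashlib.sha1(read_block(pba)).hexdigest()
            hashes_computed += 1
        # Otherwise the hash is simply copied; no read I/O or hashing is needed.
        clone_digest[lba] = (data_hash, pba)
    return clone_digest, hashes_computed


storage = {700: b"boot", 701: b"libs", 900: b"clone-only"}
source_pba_to_hash = {
    700: hashlib.sha1(b"boot").hexdigest(),
    701: hashlib.sha1(b"libs").hexdigest(),
}
clone_page_table = {10: 700, 11: 701, 12: 900}  # clone LBA -> PBA
clone_digest, computed = build_clone_digest(
    clone_page_table, source_pba_to_hash, storage.__getitem__
)
```

In this example only the block at PBA 900, which is absent from the source mapping, requires a storage read and a hash computation.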

It is noted that the process described herein with respect to illustration 200 may be repeated for each virtual disk associated with a VM (e.g., if a VM has multiple virtual disks) in order to create a digest file for each virtual disk.

FIG. 3 is an illustration 300 of an example related to utilizing a content based read cache (CBRC) for clone virtual machines (VMs) with a digest file created as described herein. Illustration 300 includes VM 103₂ and CBRC 110 of FIG. 1 and digest file 138₂ of FIG. 2. For example, digest file 138₂ may have been created for a virtual disk of VM 103₂ (e.g., a clone VM) as described above with respect to FIG. 2.

VM 103₂ may represent a clone VM created using an instant clone process. In certain aspects, instant cloning uses rapid in-memory cloning of a running source VM, and copy-on-write (COW) to rapidly deploy the clone VM. To create the clone VM, the source VM is stunned for a short period of time (e.g., less than 1 second) and brought to a quiescent state. While the source VM is stunned, a new writable delta disk is generated for each virtual disk of the VM, such that each virtual disk is represented by a base disk and a delta disk. A base disk of a virtual disk of the VM includes data of the virtual disk before the clone VM is made of the source VM. The delta disk is used to store data corresponding to writes to the virtual disk that occur after the clone VM is made of the source VM. The source VM and the clone VM share the base disk of the virtual disk, which may be put in a read-only state. However, each of the source VM and the clone VM may have its own respective delta disk where writes to the virtual disk are made from the source VM and the clone VM, respectively. Thus, read I/O requests from both the source VM and the clone VM are served from the base disk for shared data blocks, or from the respective delta disk for data blocks modified after the cloning, while write I/Os of the source VM are written to the delta disk of the source VM and write I/Os of the clone VM are written to the delta disk of the clone VM. Accordingly, if the clone VM modifies data on the virtual disk, the data on the source VM is not modified, thus preserving security and isolation between the source VM and the clone VM.
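The base disk and delta disk arrangement described above may be illustrated with the following non-limiting Python sketch. The class name `ClonedDisk`, the dictionary representation of disks, and the sample block contents are hypothetical and chosen only to show the read and write paths.

```python
class ClonedDisk:
    """Hypothetical virtual disk made of a shared base disk and a private delta disk."""

    def __init__(self, base: dict):
        self._base = base  # shared between source and clone, treated as read-only
        self._delta = {}   # per-VM writes made after the clone operation

    def read(self, lba):
        # Modified blocks are served from the delta disk, shared blocks from the base.
        if lba in self._delta:
            return self._delta[lba]
        return self._base.get(lba)

    def write(self, lba, data):
        # Writes never touch the shared base disk.
        self._delta[lba] = data


base = {0: b"os", 1: b"app"}
source_disk = ClonedDisk(base)
clone_disk = ClonedDisk(base)
clone_disk.write(1, b"patched")
```

After the write, the clone reads its own modified block while the source continues to read the unmodified block from the shared base disk.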

Regardless of whether a read I/O from the clone VM refers to an LBA that corresponds to the base disk or the delta disk of the clone VM, creating a digest file for the clone VM as described herein allows the read I/O to be served from the CBRC if the requested data is cached in the CBRC. As such, techniques described herein may greatly reduce load on the underlying storage system, particularly when multiple clone VMs are created. The digest file for a clone VM may be created according to embodiments of the present disclosure at the time the instant clone is performed, thereby allowing the CBRC to be utilized as soon as the clone VM is created.

In illustration 300, VM 103₂ issues a read request 310 comprising an LBA 312 (e.g., indicating the LBA from which data is requested to be read). The digest file 138₂ (e.g., in memory) of a virtual disk of VM 103₂ is used to determine whether the data corresponding to LBA 312 has been cached in the CBRC. For example, if digest file 138₂ contains an entry comprising LBA 312, then the hash 320 associated with LBA 312 in the entry is retrieved. The retrieved hash 320 is used to find a memory location of the requested data in CBRC 110, such as using CBRC lookup table 126 of FIG. 1. If hash 320 is found in the CBRC lookup table, the requested data 330 may be retrieved from CBRC 110 and returned to VM 103₂. If hash 320 is not found in the CBRC lookup table, the requested data may be retrieved from the virtual disk file 140 in storage 116 of FIG. 1 (e.g., using the PBA that is mapped to LBA 312 in a corresponding page table).
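As a non-limiting sketch, the read path described above may be expressed in Python as follows. The function name `serve_read` and the dictionary-based digest, lookup table, page table, and storage are hypothetical; as a simplification, the lookup table in this sketch maps a hash directly to the cached data rather than to a memory location.

```python
import hashlib


def serve_read(lba, digest, cbrc_lookup, page_table, storage):
    """Return (data, origin), where origin is 'cbrc' on a cache hit, else 'storage'."""
    data_hash = digest.get(lba)  # LBA -> hash, taken from the digest
    if data_hash is not None and data_hash in cbrc_lookup:
        return cbrc_lookup[data_hash], "cbrc"
    # Hash missing from the digest or from the lookup table: fall back to storage
    # using the PBA mapped to the LBA in the page table.
    return storage[page_table[lba]], "storage"


cached = b"boot-block"
digest = {
    5: hashlib.sha1(cached).hexdigest(),
    6: hashlib.sha1(b"cold").hexdigest(),
}
cbrc_lookup = {hashlib.sha1(cached).hexdigest(): cached}
page_table = {5: 700, 6: 701}
storage = {700: cached, 701: b"cold"}

hit = serve_read(5, digest, cbrc_lookup, page_table, storage)
miss = serve_read(6, digest, cbrc_lookup, page_table, storage)
```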

Though certain aspects herein describe I/Os with respect to blocks, they may similarly be applicable to I/Os for pages, where a page comprises multiple blocks.

Aspects of the present disclosure may provide a significant benefit to virtual desktop infrastructure (VDI) environments experiencing boot storms. A VDI boot storm is the degradation of service that occurs when a significant number of virtual endpoints boot up within a narrow time frame and overwhelm the network with data requests. Boot files of each of the VMs may be the same; thus, aspects described herein may help to ensure data block content accessed by VMs is served from CBRC 110 whenever possible rather than being retrieved from storage.

FIG. 4 depicts example operations 400 related to content based read cache (CBRC) digest file generation for clone virtual machines (VMs). For example, operations 400 may be performed by a storage management module on host 102 of FIG. 1.

Operations 400 begin at step 402, with determining a mapping between entries in a CBRC and physical block addresses (PBAs) associated with a source virtual machine (VM).

In some embodiments, determining the mapping between entries in the CBRC and the PBAs associated with the VM is based on the first digest file and a page table associated with the source VM that maps LBAs to PBAs. In certain embodiments, the PBAs are associated in the mapping with one or more identifiers of one or more storage devices.

According to some embodiments, the mapping is stored in the first digest file. In other embodiments, the mapping is stored separately from the first digest file.

Operations 400 continue at step 404, with creating a clone VM based on the source VM. In certain embodiments, creating the clone VM based on the source VM comprises performing an instant clone operation in which the clone VM is created from a running state of the source VM while the source VM is powered on, and wherein the clone VM is created in a powered on state. For example, the instant clone operation may be performed as part of a virtual desktop infrastructure (VDI) deployment.

Operations 400 continue at step 406, with, for each data block associated with the clone VM, performing steps 408, 410, 412, and 414.

Step 408 comprises determining a PBA associated with a logical block address (LBA) of the data block.

Step 410 comprises determining, based on the mapping, whether data associated with the PBA is cached in the CBRC.

Step 412 comprises, if the data associated with the PBA is cached in the CBRC, copying a hash of the data from a first digest file of the source VM to a second digest file of the clone VM and associating the hash with the LBA in the second digest file.

Step 414 comprises, if the data associated with the PBA is not cached in the CBRC, computing the hash of the data and associating the hash with the LBA in the second digest file. If the data associated with the PBA has not been cached in the CBRC, some embodiments further comprise creating a new entry in the CBRC comprising the data associated with the hash and storing a new mapping between the new entry in the CBRC and the PBA.

Certain embodiments further comprise receiving a storage read request comprising a given LBA associated with the clone VM and retrieving, based on the storage read request, the given data from the CBRC using a given hash associated with the LBA in the second digest file.

The various embodiments described herein may employ various computer-implemented operations involving data stored in computer systems. For example, these operations may require physical manipulation of physical quantities; usually, though not necessarily, these quantities may take the form of electrical or magnetic signals, where they or representations of them are capable of being stored, transferred, combined, compared, or otherwise manipulated. Further, such manipulations are often referred to in terms such as producing, identifying, determining, or comparing. Any operations described herein that form part of one or more embodiments of the invention may be useful machine operations. In addition, one or more embodiments of the invention also relate to a device or an apparatus for performing these operations. The apparatus may be specially constructed for specific required purposes, or it may be a general purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.

The various embodiments described herein may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.

One or more embodiments of the present invention may be implemented as one or more computer programs or as one or more computer program modules embodied in one or more computer readable media. The term computer readable medium refers to any data storage device that can store data which can thereafter be input to a computer system. Computer readable media may be based on any existing or subsequently developed technology for embodying computer programs in a manner that enables them to be read by a computer. Examples of a computer readable medium include a hard drive, network attached storage (NAS), read-only memory, random-access memory (e.g., a flash memory device), a CD (Compact Disc) such as a CD-ROM, a CD-R, or a CD-RW, a DVD (Digital Versatile Disc), a magnetic tape, and other optical and non-optical data storage devices. The computer readable medium can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.

Although one or more embodiments of the present invention have been described in some detail for clarity of understanding, it will be apparent that certain changes and modifications may be made within the scope of the claims. Accordingly, the described embodiments are to be considered as illustrative and not restrictive, and the scope of the claims is not to be limited to details given herein, but may be modified within the scope and equivalents of the claims. In the claims, elements and/or steps do not imply any particular order of operation, unless explicitly stated in the claims.

Virtualization systems in accordance with the various embodiments may be implemented as hosted embodiments, as non-hosted embodiments, or as embodiments that tend to blur distinctions between the two; all are envisioned. Furthermore, various virtualization operations may be wholly or partially implemented in hardware. For example, a hardware implementation may employ a look-up table for modification of storage access requests to secure non-disk data.

Certain embodiments as described above involve a hardware abstraction layer on top of a host computer. The hardware abstraction layer allows multiple contexts to share the hardware resource. In one embodiment, these contexts are isolated from each other, each having at least a user application running therein. The hardware abstraction layer thus provides benefits of resource isolation and allocation among the contexts. In the foregoing embodiments, virtual machines are used as an example for the contexts and hypervisors as an example for the hardware abstraction layer. As described above, each virtual machine includes a guest operating system in which at least one application runs. It should be noted that these embodiments may also apply to other examples of contexts, such as containers not including a guest operating system, referred to herein as “OS-less containers” (see, e.g., www.docker.com). OS-less containers implement operating system-level virtualization, wherein an abstraction layer is provided on top of the kernel of an operating system on a host computer. The abstraction layer supports multiple OS-less containers each including an application and its dependencies. Each OS-less container runs as an isolated process in userspace on the host operating system and shares the kernel with other containers. The OS-less container relies on the kernel's functionality to make use of resource isolation (CPU, memory, block I/O, network, etc.) and separate namespaces and to completely isolate the application's view of the operating environments. By using OS-less containers, resources can be isolated, services restricted, and processes provisioned to have a private view of the operating system with their own process ID space, file system structure, and network interfaces. Multiple containers can share the same kernel, but each container can be constrained to only use a defined amount of resources such as CPU, memory and I/O. 
The term “virtualized computing instance” as used herein is meant to encompass both VMs and OS-less containers.

Many variations, modifications, additions, and improvements are possible, regardless of the degree of virtualization. The virtualization software can therefore include components of a host, console, or guest operating system that performs virtualization functions. Plural instances may be provided for components, operations or structures described herein as a single instance. Boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention(s). In general, structures and functionality presented as separate components in exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the appended claim(s).

Claims

What is claimed is:

1. A method for content based read cache (CBRC) digest file creation, comprising:

determining a mapping between entries in a CBRC and physical block addresses (PBAs) associated with a source virtual machine (VM);

creating a clone VM based on the source VM; and

for each data block associated with the clone VM:

determining a PBA associated with a logical block address (LBA) of the data block;

determining, based on the mapping, whether data associated with the PBA is cached in the CBRC; and

if the data associated with the PBA is cached in the CBRC, copying a hash of the data from a first digest file of the source VM to a second digest file of the clone VM and associating the hash with the LBA in the second digest file.

2. The method of claim 1, further comprising, if the data associated with the PBA is not cached in the CBRC, computing the hash of the data and associating the hash with the LBA in the second digest file.

3. The method of claim 2, further comprising, if the data associated with the PBA is not cached in the CBRC:

creating a new entry in the CBRC comprising the data associated with the hash; and

storing a new mapping between the new entry in the CBRC and the PBA.

4. The method of claim 1, further comprising:

receiving a storage read request comprising a given LBA associated with the clone VM; and

retrieving, based on the storage read request, given data from the CBRC using a given hash associated with the LBA in the second digest file.

5. The method of claim 1, wherein determining the mapping between entries in the CBRC and the PBAs associated with the source VM is based on the first digest file and a page table associated with the source VM that maps LBAs to PBAs.

6. The method of claim 1, wherein creating the clone VM based on the source VM comprises performing an instant clone operation in which the clone VM is created from a running state of the source VM while the source VM is powered on, and wherein the clone VM is created in a powered on state.

7. The method of claim 6, wherein the instant clone operation is performed as part of a virtual desktop interface (VDI) deployment.

8. The method of claim 1, wherein the PBAs are associated in the mapping with one or more identifiers of one or more storage devices.

9. The method of claim 1, wherein the mapping is stored in the first digest file.

10. A system for content based read cache (CBRC) digest file creation, the system comprising:

at least one memory; and

at least one processor coupled to the at least one memory, the at least one processor and the at least one memory configured to:

determine a mapping between entries in a CBRC and physical block addresses (PBAs) associated with a source virtual machine (VM);

create a clone VM based on the source VM; and

for each data block associated with the clone VM:

determine a PBA associated with a logical block address (LBA) of the data block;

determine, based on the mapping, whether data associated with the PBA is cached in the CBRC; and

if the data associated with the PBA is cached in the CBRC, copy a hash of the data from a first digest file of the source VM to a second digest file of the clone VM and associating the hash with the LBA in the second digest file.

11. The system of claim 10, wherein the at least one processor and the at least one memory are further configured to, if the data associated with the PBA is not cached in the CBRC, compute the hash of the data and associating the hash with the LBA in the second digest file.

12. The system of claim 11, wherein the at least one processor and the at least one memory are further configured to, if the data associated with the PBA is not cached in the CBRC:

create a new entry in the CBRC comprising the data associated with the hash; and

store a new mapping between the new entry in the CBRC and the PBA.

13. The system of claim 10, wherein the at least one processor and the at least one memory are further configured to:

receive a storage read request comprising a given LBA associated with the clone VM; and

retrieve, based on the storage read request, given data from the CBRC using a given hash associated with the LBA in the second digest file.

14. The system of claim 10, wherein determining the mapping between entries in the CBRC and the PBAs associated with the source VM is based on the first digest file and a page table associated with the source VM that maps LBAs to PBAs.

15. The system of claim 10, wherein creating the clone VM based on the source VM comprises performing an instant clone operation in which the clone VM is created from a running state of the source VM while the source VM is powered on, and wherein the clone VM is created in a powered on state.

16. The system of claim 15, wherein the instant clone operation is performed as part of a virtual desktop interface (VDI) deployment.

17. The system of claim 10, wherein the PBAs are associated in the mapping with one or more identifiers of one or more storage devices.

18. The system of claim 10, wherein the mapping is stored in the first digest file.

19. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to:

create a clone VM based on the source VM; and

for each data block associated with the clone VM:

20. The non-transitory computer-readable medium of claim 19, wherein the instructions, when executed by the one or more processors, further cause the one or more processors to, if the data associated with the PBA is not cached in the CBRC, compute the hash of the data and associating the hash with the LBA in the second digest file.