US11016692B2 - Dynamically switching between memory copy and memory mapping to optimize I/O performance - Google Patents


Publication number
US11016692B2
Authority
US
United States
Prior art keywords
memory
data transfer
transfer technique
request
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US16/567,747
Other versions
US20210072918A1 (en)
Inventor
Lokesh M. Gupta
Kevin J. Ash
Brian A. Rinaldi
Kyler A. Anderson
Matthew J. Kalos
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignment of assignors interest (see document for details). Assignors: ANDERSON, KYLER A.; ASH, KEVIN J.; GUPTA, LOKESH M.; RINALDI, BRIAN A.; KALOS, MATTHEW J.
Priority to US16/567,747 (US11016692B2)
Priority to GB2203249.4A (GB2602404B)
Priority to DE112020003721.5T (DE112020003721B4)
Priority to CN202080049597.2A (CN114127699B)
Priority to JP2022515916A (JP7495191B2)
Priority to PCT/IB2020/058197 (WO2021048709A1)
Publication of US20210072918A1
Publication of US11016692B2
Application granted
Expired - Fee Related
Adjusted expiration


Abstract

A method to dynamically switch between data transfer techniques includes receiving an I/O request and computing a cost of executing the I/O request using a memory copy data transfer technique. The memory copy data transfer technique copies cache segments associated with the I/O request from cache memory to a permanently mapped memory, which is permanently mapped to a bus address window. The method also computes a cost of executing the I/O request using a memory mapping data transfer technique. The memory mapping data transfer technique temporarily maps cache segments associated with the I/O request from the cache memory to the bus address window. The method uses one of the memory copy data transfer technique and the memory mapping data transfer technique to transfer cache segments associated with the I/O request, depending on which one is less costly. A corresponding system and computer program product are also disclosed.

Description

BACKGROUND
Field of the Invention
This invention relates to systems and methods for dynamically switching between memory copy and memory mapping techniques to optimize I/O performance in storage systems.
Background of the Invention
A Peripheral Component Interconnect (PCI) host bridge may enable communication between a processor and an input/output (I/O) subsystem within a data processing system. The PCI host bridge provides data buffering capabilities to enable read and write data to be transferred between the processor and the I/O subsystem. The I/O subsystem may be a group of PCI devices connected to a PCI bus. When a PCI device on the PCI bus originates a read or write command to a system memory via a direct memory access (DMA), the PCI host bridge translates a PCI address of the DMA to a system memory address of the system memory.
Each PCI device on the PCI bus may be associated with a corresponding translation control entry (TCE) table resident within the system memory. The TCE table may be utilized to perform TCE translations from PCI addresses to system memory addresses. In response to a DMA read or write operation, a corresponding TCE table is read by the PCI host bridge to provide a TCE translation.
In storage systems such as the IBM DS8000™ enterprise storage system, each I/O that is processed by the storage system requires mapping cache memory of the storage system one or more times. For example, a read hit to the cache memory requires creation of a TCE mapping so that a host adapter can read the cache memory via a DMA. This TCE mapping is then unmapped after the DMA is complete. In the case of a read miss, two TCE mappings are required: one mapping between the cache memory and a device adapter in order to retrieve the read data from storage drives, and a second mapping between the cache memory and a host adapter in order to return the read data to a host system. After the DMAs are complete, the TCE mappings may be unmapped.
In view of the foregoing, what are needed are alternative data transfer techniques for transferring data within storage systems such as the IBM DS8000™ enterprise storage system. Further needed are systems and methods to dynamically switch between several data transfer techniques to optimize I/O performance in storage systems such as the IBM DS8000™ enterprise storage system.
SUMMARY
The invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available systems and methods. Accordingly, embodiments of the invention have been developed to dynamically switch between memory copy and memory mapping data transfer techniques to improve I/O performance. The features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
Consistent with the foregoing, a method is disclosed to dynamically switch between memory copy and memory mapping data transfer techniques to improve I/O performance. The method receives an I/O request and computes a cost of executing the I/O request using a memory copy data transfer technique. The memory copy data transfer technique copies cache segments associated with the I/O request from cache memory to a permanently mapped memory, which is permanently mapped to a bus address window. The method also computes a cost of executing the I/O request using a memory mapping data transfer technique. The memory mapping data transfer technique temporarily maps cache segments associated with the I/O request from the cache memory to the bus address window. The method uses one of the memory copy data transfer technique and the memory mapping data transfer technique to transfer cache segments associated with the I/O request, depending on which one is less costly.
A corresponding system and computer program product are also disclosed and claimed herein.
BRIEF DESCRIPTION OF THE DRAWINGS
In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered limiting of its scope, the invention will be described and explained with additional specificity and detail through use of the accompanying drawings, in which:
FIG. 1 is a high-level block diagram showing one example of a network environment in which systems and methods in accordance with the invention may be implemented;
FIG. 2 is a high-level block diagram showing one embodiment of a storage system for use in the network environment of FIG. 1;
FIG. 3 is a high-level block diagram showing one example of a memory mapping data transfer technique;
FIG. 4 is a high-level block diagram showing one example of a memory copy data transfer technique;
FIG. 5 is a flow diagram showing one embodiment of a method for determining which data transfer technique to use for a particular I/O request;
FIG. 6 is a high-level block diagram showing “mapping” windows allocated for use with a memory mapping data transfer technique and “copy” windows allocated for use with a memory copy data transfer technique;
FIG. 7 is a high-level block diagram showing dynamically adjusting a number of “mapping” windows and a number of “copy” windows to promote efficiency when processing I/O requests;
FIG. 8 is a flow diagram showing one embodiment of a method for optimizing a number of “mapping” windows used in association with a memory mapping data transfer technique, and a number of “copy” windows used in association with a memory copy data transfer technique;
FIG. 9 is a flow diagram showing another embodiment of a method for optimizing a number of “mapping” windows used in association with a memory mapping data transfer technique, and a number of “copy” windows used in association with a memory copy data transfer technique; and
FIG. 10 is a flow diagram showing one embodiment of a method for determining whether to utilize a memory mapping data transfer technique or a memory copy data transfer technique to process an I/O request.
DETAILED DESCRIPTION
It will be readily understood that the components of the present invention, as generally described and illustrated in the Figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the invention, as represented in the Figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of certain examples of presently contemplated embodiments in accordance with the invention. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
The present invention may be embodied as a system, method, and/or computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage system, a magnetic storage system, an optical storage system, an electromagnetic storage system, a semiconductor storage system, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage system via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
The computer readable program instructions may execute entirely on a user's computer, partly on a user's computer, as a stand-alone software package, partly on a user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, a remote computer may be connected to a user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention may be described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring to FIG. 1, one example of a network environment 100 is illustrated. The network environment 100 is presented to show one example of an environment where systems and methods in accordance with the invention may be implemented. The network environment 100 is presented by way of example and not limitation. Indeed, the systems and methods disclosed herein may be applicable to a wide variety of different network environments in addition to the network environment 100 shown.
As shown, the network environment 100 includes one or more computers 102, 106 interconnected by a network 104. The network 104 may include, for example, a local-area-network (LAN) 104, a wide-area-network (WAN) 104, the Internet 104, an intranet 104, or the like. In certain embodiments, the computers 102, 106 may include both client computers 102 and server computers 106 (also referred to herein as “hosts” 106 or “host systems” 106). In general, the client computers 102 initiate communication sessions, whereas the server computers 106 wait for and respond to requests from the client computers 102. In certain embodiments, the computers 102 and/or servers 106 may connect to one or more internal or external direct-attached storage systems 112 (e.g., arrays of hard-storage drives, solid-state drives, tape drives, etc.). These computers 102, 106 and direct-attached storage systems 112 may communicate using protocols such as ATA, SATA, SCSI, SAS, Fibre Channel, or the like.
The network environment 100 may, in certain embodiments, include a storage network 108 behind the servers 106, such as a storage-area-network (SAN) 108 or a LAN 108 (e.g., when using network-attached storage). This network 108 may connect the servers 106 to one or more storage systems 110, such as arrays 110a of hard-disk drives or solid-state drives, tape libraries 110b, individual hard-disk drives 110c or solid-state drives 110c, tape drives 110d, CD-ROM libraries, or the like. To access a storage system 110, a host system 106 may communicate over physical connections from one or more ports on the host 106 to one or more ports on the storage system 110. A connection may be through a switch, fabric, direct connection, or the like. In certain embodiments, the servers 106 and storage systems 110 may communicate using a networking standard such as Fibre Channel (FC) or iSCSI.
Referring to FIG. 2, one example of a storage system 110a containing an array of hard-disk drives 204 and/or solid-state drives 204 is illustrated. The internal components of the storage system 110a are shown since systems and methods in accordance with the invention may be implemented within such a storage system 110a. As shown, the storage system 110a includes a storage controller 200, one or more switches 202, and one or more storage drives 204, such as hard-disk drives 204 and/or solid-state drives 204 (e.g., flash-memory-based drives 204). The storage controller 200 may enable one or more host systems 106 (e.g., open system and/or mainframe servers 106 running operating systems such as z/OS, z/VM, or the like) to access data in the one or more storage drives 204.
In selected embodiments, the storage controller 200 includes one or more servers 206a, 206b. The storage controller 200 may also include host adapters 208 and device adapters 210 to connect the storage controller 200 to host systems 106 and storage drives 204, respectively. Multiple servers 206a, 206b may provide redundancy to ensure that data is always available to connected host systems 106. Thus, when one server 206a fails, the other server 206b may pick up the I/O load of the failed server 206a to ensure that I/O is able to continue between the host systems 106 and the storage drives 204. This process may be referred to as a “failover.”
In selected embodiments, each server 206 includes one or more processors 212 and memory 214. The memory 214 may include volatile memory (e.g., RAM) as well as non-volatile memory (e.g., ROM, EPROM, EEPROM, hard disks, flash memory, etc.). The volatile and non-volatile memory may, in certain embodiments, store software modules that run on the processor(s) 212 and are used to access data in the storage drives 204. These software modules may manage all read and write requests to logical volumes in the storage drives 204.
In certain embodiments, the memory 214 includes a cache 216, such as a DRAM cache 216. Whenever a host 106 (e.g., an open system or mainframe server) performs a read operation for data that is not resident in cache 216, the server 206 that performs the read may fetch data from the storage drives 204 and save it in its cache 216 in the event it is needed again. If the data is requested again by a host system 106, the server 206 may fetch the data from the cache 216 instead of fetching it from the storage drives 204, saving both time and resources. Similarly, when a host system 106 performs a write, the server 206 that receives the write request may store the modified data in its cache 216, and destage the modified data to the storage drives 204 at a later time.
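By way of example and not limitation, the caching behavior described above may be sketched as follows. The class name `CachingServer`, the dictionary standing in for the storage drives 204, and the `modified` set are hypothetical and are introduced only for illustration; they are not part of the disclosure.

```python
class CachingServer:
    """Toy model of a server 206 with a read/write cache (illustrative only)."""

    def __init__(self, drives):
        self.drives = drives    # dict standing in for the storage drives
        self.cache = {}         # track id -> data held in cache
        self.modified = set()   # tracks written to cache but not yet destaged

    def read(self, track_id):
        """Return (data, "hit" or "miss"); a miss stages the data into cache."""
        if track_id in self.cache:
            return self.cache[track_id], "hit"
        data = self.drives[track_id]
        self.cache[track_id] = data
        return data, "miss"

    def write(self, track_id, data):
        """Store modified data in cache only; the drives are updated later."""
        self.cache[track_id] = data
        self.modified.add(track_id)

    def destage(self):
        """Write all modified tracks from cache to the drives."""
        for track_id in sorted(self.modified):
            self.drives[track_id] = self.cache[track_id]
        self.modified.clear()
```

In this sketch a second read of the same track is served from the cache, and a write does not reach the drives until `destage` is called.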
One example of a storage system 110a having an architecture similar to that illustrated in FIG. 2 is the IBM DS8000™ enterprise storage system. The DS8000™ is a high-performance, high-capacity storage controller providing disk and solid-state storage that is designed to support continuous operations. Nevertheless, the techniques disclosed herein are not limited to the IBM DS8000™ enterprise storage system 110a, but may be implemented in any comparable or analogous storage system 110, regardless of the manufacturer, product name, or components or component names associated with the system 110. Any storage system that could benefit from one or more embodiments of the invention is deemed to fall within the scope of the invention. Thus, the IBM DS8000™ is presented only by way of example and not limitation.
Referring to FIG. 3, in general, a Peripheral Component Interconnect (PCI) host bridge may enable communication between a processor and an input/output (I/O) subsystem within a data processing system. The PCI host bridge may provide data buffering capabilities to enable read and write data to be transferred between the processor and the I/O subsystem. The I/O subsystem may be a group of PCI devices (host adapters and/or device adapters) connected to a PCI bus. When a PCI device on the PCI bus originates a read or write command to a system memory via a direct memory access (DMA), the PCI host bridge may translate a PCI address of the DMA to a system memory address of the system memory.
Each PCI device on a PCI bus may be associated with a corresponding translation control entry (TCE) mapping 302 resident within the system memory 214. The TCE mappings 302 may be utilized to perform TCE translations from PCI addresses to system memory addresses. In response to a DMA read or write operation, a corresponding TCE mapping is read by the PCI host bridge to provide a TCE translation.
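For purposes of illustration only, a translation from a PCI (bus) address to a system memory address may be modeled as a page-granular table lookup. The sketch below assumes a 4096-byte translation granularity and a dictionary-based table; the class `TceTable` and its method names are hypothetical and do not represent any actual host bridge interface.

```python
PAGE_SIZE = 4096  # assumed translation granularity (not stated in the text)


class TceTable:
    """Toy translation table: bus page number -> system memory page number."""

    def __init__(self):
        self._entries = {}

    def map(self, bus_page, mem_page):
        """Create a translation entry for one bus page."""
        self._entries[bus_page] = mem_page

    def unmap(self, bus_page):
        """Remove the translation entry for one bus page, if present."""
        self._entries.pop(bus_page, None)

    def translate(self, bus_addr):
        """Translate a bus address to a system memory address."""
        bus_page, offset = divmod(bus_addr, PAGE_SIZE)
        if bus_page not in self._entries:
            raise LookupError(f"no TCE for bus address {bus_addr:#x}")
        return self._entries[bus_page] * PAGE_SIZE + offset
```

A DMA to an address whose entry has been unmapped fails the lookup in this model, which is why each transfer needs a valid mapping for its duration.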
In storage systems such as the IBM DS8000™ enterprise storage system, each I/O that is processed by the storage system 110 requires mapping cache memory 216 of the storage system 110 one or more times. For example, a read hit to the cache memory 216 requires creation of a TCE mapping 302 so that a host adapter 208 can read the cache memory 216 via a DMA. This TCE mapping 302 is then unmapped after the DMA is complete. In the case of a read miss, two TCE mappings are required: one mapping 302 between the cache memory 216 and a device adapter 210 in order to retrieve the read data from storage drives 204, and a second mapping 302 between the cache memory 216 and a host adapter 208 in order to return the read data to a host system 106. After the DMAs are complete, the TCE mappings 302 may be unmapped.
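The number of mappings needed for a read hit versus a read miss may be illustrated with the following sketch. The function names and step labels are hypothetical, and the exact point at which each mapping is released is an assumption made for illustration.

```python
def tce_steps(read_hit):
    """Return the ordered map/DMA/unmap steps for a read (illustrative only)."""
    steps = []
    if not read_hit:
        # Read miss: data is first staged from the storage drives into cache
        # through a mapping between the cache and a device adapter.
        steps += [
            "map cache<->device adapter",
            "DMA drives->cache",
            "unmap cache<->device adapter",
        ]
    # Hit or miss: the host adapter reads the cache through its own mapping.
    steps += [
        "map cache<->host adapter",
        "DMA cache->host",
        "unmap cache<->host adapter",
    ]
    return steps


def mappings_required(read_hit):
    """Count the TCE mappings created for one read."""
    return sum(1 for step in tce_steps(read_hit) if step.startswith("map "))
```

Under these assumptions a read hit creates one mapping and a read miss creates two, matching the description above.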
TCE mapping and unmapping may be costly in terms of time, especially with high I/O rates. One way to circumvent the need for TCE mappings 302 is to keep certain portions of cache memory 216 permanently mapped (i.e., use dedicated permanently mapped memory). When an I/O arrives, requested data may be copied from the cache memory 216 to this permanently mapped memory 400. The DMA may then occur from this permanently mapped memory 400 without needing to perform a TCE mapping/unmapping. This technique eliminates the cost (e.g., time needed) to perform the TCE mapping/unmapping, but introduces the cost (e.g., time needed) to copy data from one memory location to another. This cost may depend on where the two memory locations are relative to one another. Sometimes the cost may be less to perform a TCE mapping/unmapping and other times the cost may be less to copy data to permanently mapped memory 400.
In view of the foregoing, systems and methods are needed to dynamically switch between memory copy and memory mapping data transfer techniques to optimize I/O performance in storage systems such as the IBM DS8000™ enterprise storage system. Ideally, depending on the I/O operation involved, such systems and methods will utilize the data transfer technique (i.e., memory copy or memory mapping) that is most efficient.
FIG. 3 is a high-level block diagram showing one example of a memory mapping data transfer technique, such as TCE mapping. As shown, a cache 216 may include one or more cache segments 300, such as four-kilobyte cache segments 300. In certain embodiments, a data element such as a “track” may be made up of multiple cache segments 300, such as seventeen cache segments 300. Thus, where a track is made up of seventeen cache segments 300 of four kilobytes each, the track may contain sixty-eight kilobytes of data. In many cases, the cache segments 300 associated with a track may not be contiguous in the cache 216. That is, the cache segments 300 of the track may be sporadically or randomly located in different locations in the cache 216. Thus, to read or write a track (i.e., a contiguous sequence of cache segments 300) in the cache 216, the track may need to be mapped to corresponding cache segments 300. In certain embodiments, a mapping 302 (e.g., a TCE mapping 302) may map cache segments 300 associated with the track to a bus address window 304 so that a host adapter 208 and/or device adapter 210 may transfer the track to/from the cache 216 via DMA. In certain embodiments, the mapping 302 may order the cache segments 300 in the order they are arranged in the track, as shown in FIG. 3.
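As a minimal sketch of the foregoing, scattered cache segments may be presented as a contiguous range of bus addresses by pairing each segment, in track order, with the next slot of a bus address window. The segment size and segment count are taken from the example above; the function names, the window base address, and the representation of a mapping as a list of address pairs are assumptions made for illustration.

```python
SEGMENT_SIZE = 4 * 1024   # four-kilobyte cache segments, as in the example
SEGMENTS_PER_TRACK = 17   # seventeen segments per track, as in the example


def track_size_bytes():
    """Size of a full track: 17 segments x 4 KB = 68 KB."""
    return SEGMENT_SIZE * SEGMENTS_PER_TRACK


def build_mapping(window_base, segment_addrs):
    """Pair each segment, in track order, with a contiguous bus address.

    segment_addrs lists the (possibly scattered) cache addresses of the
    track's segments in the order they appear in the track.
    Returns a list of (bus_address, cache_address) pairs.
    """
    return [
        (window_base + index * SEGMENT_SIZE, cache_addr)
        for index, cache_addr in enumerate(segment_addrs)
    ]
```

An adapter that walks the bus addresses in ascending order would then see the segments in track order, regardless of where they reside in the cache.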
FIG. 4 is a high-level block diagram showing an example of a memory copy data transfer technique. As shown, instead of mapping cache segments 300 to a bus address window 304, a memory copy data transfer technique may first copy cache segments 300 associated with a data element (e.g., track) to a permanently mapped memory 400. The permanently mapped memory 400 may reside in the same memory 214 (e.g., memory chip) as the cache 216 or in a different memory 214 (e.g., memory chip). Thus, copying the cache segments 300 from the cache 216 to the permanently mapped memory 400 may have some cost, the magnitude of which may vary in accordance with the locations of the cache 216 and the permanently mapped memory 400 and the time needed to copy data therebetween. In certain embodiments, the copied cache segments 300 may be ordered in the permanently mapped memory 400 in the same way they exist in the track, thereby providing a contiguous ordered group of cache segments 300 that can be transferred via DMA by a host adapter 208 and/or device adapter 210.
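The copy step may be illustrated, under simplifying assumptions, by gathering scattered segments into a contiguous buffer. In the sketch below the cache and the permanently mapped memory are modeled as byte buffers, segments are identified by byte offsets into the cache, and the function name `copy_to_permanent` is hypothetical.

```python
SEGMENT_SIZE = 4096  # four-kilobyte cache segments, as in the example


def copy_to_permanent(cache, segment_offsets, permanent, dest_offset=0):
    """Copy segments, in track order, into a contiguous destination region.

    cache: bytes-like object standing in for cache memory.
    segment_offsets: byte offsets of the segments within cache, in track order.
    permanent: bytearray standing in for the permanently mapped memory.
    Returns the number of bytes copied.
    """
    needed = len(segment_offsets) * SEGMENT_SIZE
    if dest_offset + needed > len(permanent):
        raise ValueError("permanently mapped region too small")
    source = memoryview(cache)
    position = dest_offset
    for offset in segment_offsets:
        if offset + SEGMENT_SIZE > len(cache):
            raise ValueError("segment lies outside the cache")
        permanent[position:position + SEGMENT_SIZE] = source[offset:offset + SEGMENT_SIZE]
        position += SEGMENT_SIZE
    return needed
```

Because the destination is already mapped, no mapping or unmapping is performed in this sketch; the only work is the copy itself.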
Referring to FIG. 5, a flow diagram showing one embodiment of a method 500 for determining which data transfer technique to use with respect to a particular I/O request is illustrated. This method 500 may be performed each time an I/O request is received by the storage system 110. As shown, the method 500 initially receives 502 an I/O request. The method 500 then computes 504 a cost associated with executing the I/O request using a memory mapping data transfer technique, such as the memory mapping data transfer technique described in FIG. 3. In certain embodiments, the cost may be calculated 504 by analyzing past statistics to determine how long it typically takes to map and unmap a particular track of data.
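One way past statistics might be used, offered only as a sketch, is to average the observed map-plus-unmap time over a bounded history of recent transfers. The class name `MappingCostEstimator`, the history length, and the use of a simple mean are assumptions; the text above does not specify how the statistics are analyzed.

```python
from collections import deque


class MappingCostEstimator:
    """Estimate mapping cost as the mean of recent map+unmap durations."""

    def __init__(self, history=1000):
        # Bounded history: the oldest samples are discarded automatically.
        self._samples = deque(maxlen=history)

    def record(self, map_seconds, unmap_seconds):
        """Record the time spent mapping and unmapping one track."""
        self._samples.append(map_seconds + unmap_seconds)

    def estimate(self, default=0.0):
        """Return the mean recorded cost, or default if nothing is recorded."""
        if not self._samples:
            return default
        return sum(self._samples) / len(self._samples)
```

A bounded history lets the estimate follow changes in load; other statistics (for example, a weighted average) could equally be substituted.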
The method 500 then computes 506 the cost of executing the I/O request using a memory copy data transfer technique, such as the memory copy data transfer technique described in association with FIG. 4. In certain embodiments, the cost associated with using the memory copy data transfer technique is calculated by determining a number of cache segments 300 to copy to the permanently mapped memory 400. In certain embodiments, the memory copy data transfer technique may be used to copy less than a full track of data whereas the memory mapping data transfer technique may need to map a full track of cache segments 300. Thus, the memory copy data transfer technique may be more efficient with smaller transfers (e.g., less than a full track of data) than the memory mapping data transfer technique. The cost associated with the memory copy data transfer technique may also depend on the relative locations of the cache 216 and the permanently mapped memory 400. If the cache 216 and permanently mapped memory 400 are located on the same memory chip, for example, the cost may be less since the time to copy the data may be shorter. On the other hand, if the cache 216 and permanently mapped memory 400 are located on different memory chips, the cost may be more since the time required to copy the data may be longer.
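A simple cost model consistent with the foregoing, presented as an assumption rather than as the disclosed calculation, scales the cost with the number of cache segments to copy and applies a higher per-segment cost when the cache and the permanently mapped memory reside on different memory chips. The function name and the per-segment cost values are hypothetical placeholders.

```python
def copy_cost(num_segments, same_chip,
              per_segment_same=1.0, per_segment_cross=2.5):
    """Estimate the cost of copying num_segments cache segments.

    The per-segment costs are illustrative placeholders in arbitrary units;
    a real system would presumably derive them from measurement.
    """
    if num_segments < 0:
        raise ValueError("num_segments must be non-negative")
    rate = per_segment_same if same_chip else per_segment_cross
    return num_segments * rate
```

Under this model a two-segment transfer costs far less than a full seventeen-segment track, which reflects why memory copy may be favored for small transfers.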
The method 500 then compares 508 the cost of the memory mapping data transfer technique to the cost of the memory copy data transfer technique. If the cost of the memory mapping data transfer technique is larger, the method 500 may use 510, if possible, the memory copy data transfer technique to transfer data associated with the I/O request between the cache 216 and a host adapter 208 and/or device adapter 210. If, on the other hand, the cost of the memory copy data transfer technique is larger, the method 500 may use 512, if possible, the memory mapping data transfer technique to transfer data associated with the I/O request between the cache 216 and a host adapter 208 and/or device adapter 210. As will be explained in more detail in association with FIG. 10, use of either the memory mapping or memory copy data transfer technique may depend on whether “mapping” windows or “copy” windows are available to transfer the data. A more detailed embodiment of a method for performing steps 510 and 512 of FIG. 5 will be described in association with FIG. 10.
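The comparison step may be sketched as follows. The enumeration and function names are hypothetical, and the handling of equal costs (memory mapping is chosen on a tie) is an assumption, since the text addresses only the cases where one cost is larger.

```python
from enum import Enum


class Technique(Enum):
    MEMORY_COPY = "memory copy"
    MEMORY_MAPPING = "memory mapping"


def choose_technique(mapping_cost, copy_cost):
    """Pick the less costly technique for one I/O request.

    Assumption: on a tie, memory mapping is chosen.
    """
    if mapping_cost > copy_cost:
        return Technique.MEMORY_COPY
    return Technique.MEMORY_MAPPING
```

Whether the chosen technique can actually be used would further depend on window availability, which this sketch does not model.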
Referring to FIG. 6, in certain embodiments, a specified number of “mapping” windows 600 may be allocated for transferring data using the memory mapping data transfer technique, and a specified number of “copy” windows 602 may be allocated for transferring data using the memory copy data transfer technique. Each “mapping” window may provide a bus address window 304 for transferring data using the memory mapping data transfer technique, and each “copy” window may provide a bus address window 304 for transferring data using the memory copy data transfer technique. As was previously mentioned, a bus address window 304 may provide a way for a host adapter 208 and/or device adapter 210 to read or write a certain amount of contiguous storage space (e.g., a track) on an address bus.
For example, assume that a total of two thousand windows are initially allocated for transferring data and, of these two thousand windows, one thousand are “mapping” windows and the other thousand are “copy” windows. The “mapping” windows may be used to service I/O requests for which the memory mapping data transfer technique is deemed more efficient, and the “copy” windows may be used to service I/O requests for which the memory copy data transfer technique is deemed more efficient. If a certain number of “copy” windows and “mapping” windows are initially allocated for transferring data, systems and methods in accordance with the invention may dynamically adjust the respective number of windows that are allocated to each data transfer technique in accordance with incoming I/O requests. For example, if not enough “copy” windows are available to service incoming I/O requests that are identified to use the memory copy data transfer technique, more of the total windows may be allocated to “copy” windows 602 and fewer of the total windows may be allocated to “mapping” windows 600, as shown in FIG. 7. In this way, the number of “copy” windows and the number of “mapping” windows may be dynamically changed to correspond to incoming I/O requests.
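By way of illustration, shifting windows between the two pools while keeping the total fixed may be sketched as below. The class `WindowAllocation`, its method names, and the choice of how many windows to move are hypothetical; the text does not prescribe a particular adjustment amount.

```python
from dataclasses import dataclass


@dataclass
class WindowAllocation:
    """Counts of "mapping" and "copy" windows drawn from a fixed total."""
    mapping: int
    copy: int

    @property
    def total(self):
        return self.mapping + self.copy

    def shift_to_copy(self, count):
        """Convert up to count "mapping" windows into "copy" windows."""
        moved = max(0, min(count, self.mapping))
        self.mapping -= moved
        self.copy += moved
        return moved

    def shift_to_mapping(self, count):
        """Convert up to count "copy" windows into "mapping" windows."""
        moved = max(0, min(count, self.copy))
        self.copy -= moved
        self.mapping += moved
        return moved
```

Starting from one thousand windows of each kind, `shift_to_copy(200)` would leave eight hundred “mapping” windows and twelve hundred “copy” windows, with the total unchanged at two thousand.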
Referring to FIG. 8, one embodiment of a method 800 for allocating windows and dynamically changing the allocation of windows is illustrated. As shown, the method 800 initially allocates 802 a first number of “copy” windows to be used in association with the memory copy data transfer technique and a second number of “mapping” windows to be used in association with the memory mapping data transfer technique. In certain embodiments, allocating the windows may include allocating a certain amount of memory 214 to implement the windows. For example, two gigabytes of memory 214 may be allocated to the windows, with one gigabyte allocated to the mappings 302 associated with the memory mapping data transfer technique, and one gigabyte allocated to the permanently mapped memory 400 associated with the memory copy data transfer technique.
In other embodiments, the allocation may include a total number of windows, with a certain proportion of the total windows being “mapping” windows and the remaining proportion of the total windows being “copy” windows. In certain embodiments, the total number of windows or the total amount of memory 214 allocated to windows is fixed. In other embodiments, the total number of windows or the total amount of memory 214 allocated to windows is adjusted as needed. The initial allocation of windows may be based on an estimate or guess of how many are needed or based on statistical data such as the type of I/O that has been received in the past.
Once a first number of “copy” windows and a second number of “mapping” windows have been allocated, the method 800 processes 804 I/O requests over a period of time using, if possible, the most efficient data transfer technique to process the I/O requests. That is, if the memory mapping data transfer technique is deemed to be more efficient to process an I/O request, the method 800 ideally utilizes the memory mapping data transfer technique and an associated “mapping” window to process the I/O request. Similarly, if the memory copy data transfer technique is deemed to be more efficient to process an I/O request, the method 800 ideally utilizes the memory copy data transfer technique and an associated “copy” window to process the I/O request.
While processing the I/O requests, the method 800 tracks 806 the number of times that the memory copy data transfer technique was ideally utilized but was not available due to a lack of associated “copy” windows. Similarly, the method 800 tracks 808 the number of times that the memory mapping data transfer technique was ideally utilized but was not available due to a lack of associated “mapping” windows. Based on the number of times each type of window was unavailable, the method 800 dynamically changes 810 the allocation of “copy” windows and “mapping” windows (e.g., changes the number of “copy” windows relative to the number of “mapping” windows, or increases/decreases the number of “copy” windows and/or “mapping” windows). This may be performed with the goal of minimizing the number of times that windows of a certain type are needed but unavailable.
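One possible way to act on the two counters tracked in steps 806 and 808 is sketched below. The function name `rebalance_by_misses`, the `step` parameter, and the rule of moving windows toward whichever type was unavailable more often are all assumptions made for illustration; the text states only that the allocation changes based on the counts.

```python
def rebalance_by_misses(copy_windows, mapping_windows,
                        copy_misses, mapping_misses, step=50):
    """Shift up to `step` windows toward the type that was unavailable more often.

    `copy_misses` and `mapping_misses` are the counts of requests that wanted
    a window of that type but found none. Returns the new
    (copy_windows, mapping_windows) pair; the total stays constant.
    """
    if copy_misses > mapping_misses:
        # "Copy" windows ran short more often: take windows from "mapping".
        moved = min(step, mapping_windows)
        return copy_windows + moved, mapping_windows - moved
    if mapping_misses > copy_misses:
        # "Mapping" windows ran short more often: take windows from "copy".
        moved = min(step, copy_windows)
        return copy_windows - moved, mapping_windows + moved
    # Equal miss counts: leave the allocation unchanged.
    return copy_windows, mapping_windows
```

Calling this once per measurement interval and then resetting the counters would approximate the loop of FIG. 8 under these assumptions.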
Referring to FIG. 9, another embodiment of a method 900 for allocating windows and dynamically changing the allocation of windows is illustrated. As shown, the method 900 initially allocates 902 a first number of “copy” windows to be used in association with the memory copy data transfer technique and a second number of “mapping” windows to be used in association with the memory mapping data transfer technique. Once a first number of “copy” windows and a second number of “mapping” windows are allocated, the method 900 processes 904 I/O requests over a period of time using, if possible, the most efficient data transfer technique to process the I/O requests.
While the I/O requests are being processed, the method 900 tracks 906 the proportion of I/O requests that are of certain types. For example, the method 900 may track 906 what proportion of the I/O requests are sequential I/O requests, large random I/O requests, and small random I/O requests. Sequential I/O requests and large random I/O requests are typically full track accesses and thus may be processed more efficiently using the memory mapping data transfer technique. Small random I/O requests, by contrast, may include less-than-full-track accesses and thus may be processed more efficiently using the memory copy data transfer technique. As was previously explained, the memory copy data transfer technique may be used to copy less than a full track of data whereas the memory mapping data transfer technique may need to map a full track of cache segments 300.
In accordance with the proportion of I/O requests that are of each type, the method 900 may dynamically adjust the number of “copy” windows and the number of “mapping” windows to conform to the composition and type of incoming I/O requests. This may assure, as much as possible, that the most efficient data transfer technique is selected and used for each incoming I/O request.
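The proportion-based adjustment of method 900 might be sketched as follows. The request-type labels, the function name `allocate_by_mix`, the strictly proportional split, and the `min_per_type` floor are assumptions chosen for illustration and are not specified in the text.

```python
from collections import Counter

# Request types the text associates with full-track accesses, which favor
# the memory mapping technique. The string labels are hypothetical.
MAPPING_FRIENDLY = {"sequential", "large_random"}


def allocate_by_mix(request_types, total_windows, min_per_type=1):
    """Split `total_windows` in proportion to the observed request mix.

    Returns (copy_windows, mapping_windows).
    """
    counts = Counter(request_types)
    total = sum(counts.values())
    if total == 0:
        # No history yet: fall back to an even split.
        half = total_windows // 2
        return total_windows - half, half
    mapping_share = sum(counts[t] for t in MAPPING_FRIENDLY) / total
    mapping = round(total_windows * mapping_share)
    # Keep at least `min_per_type` windows of each kind (an assumption).
    mapping = max(min_per_type, min(total_windows - min_per_type, mapping))
    return total_windows - mapping, mapping


# 40% of observed requests are full-track, 60% are small random requests.
observed = ["sequential"] * 3 + ["large_random"] + ["small_random"] * 6
print(allocate_by_mix(observed, 2000))  # (1200, 800)
```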
FIG. 10 is a flow diagram showing one embodiment of a method 1000 for determining whether to utilize a memory mapping data transfer technique or a memory copy data transfer technique to process an I/O request. In certain embodiments, this method 1000 is used in place of steps 510, 512 illustrated in FIG. 5. As shown, the method 1000 initially determines 1002, for a received I/O request, whether using the memory copy data transfer technique is less costly than using the memory mapping data transfer technique. If so, the method 1000 determines whether an available number of “copy” windows is below a threshold (e.g., 100), an available number of “mapping” windows is above a threshold (e.g., 100), and a cost difference between using the memory copy data transfer technique and using the memory mapping data transfer technique is below a threshold (e.g., 5 microseconds). If these conditions are satisfied, the method 1000 uses 1004 the memory mapping data transfer technique to transfer data associated with the I/O request. In essence, this step 1004 uses the memory mapping data transfer technique to transfer data associated with the I/O request if “copy” windows are in short supply, “mapping” windows are not in short supply, and the cost difference between the data transfer techniques is not too large. Otherwise, the method 1000 proceeds to the next step 1006.
At step 1006, if no “copy” windows are available but at least one “mapping” window is available, the method 1000 uses 1006 the memory mapping data transfer technique to transfer data associated with the I/O request regardless of the cost difference between using the memory copy data transfer technique and using the memory mapping data transfer technique. In essence, this step 1006 uses the memory mapping data transfer technique to transfer data associated with the I/O request if it is the only option available, even if using the memory copy data transfer technique would be the most efficient. Otherwise, the method 1000 proceeds to the next step 1008.
At step 1008, if no “copy” windows and no “mapping” windows are available, the method 1000 waits 1008 for the next available window (“copy” window or “mapping” window) and uses this window along with the corresponding data transfer technique to transfer the data associated with the I/O request. This is performed regardless of the cost difference between using the memory copy data transfer technique and using the memory mapping data transfer technique. Otherwise, the method 1000 proceeds to the next step 1010. At step 1010, the method 1000 uses the memory copy data transfer technique to transfer data associated with the I/O request since “copy” windows are available and using the memory copy data transfer technique is less costly than using the memory mapping data transfer technique.
If, at step 1002, the memory copy data transfer technique is not less costly than the memory mapping data transfer technique (meaning that the memory mapping data transfer technique is less costly than the memory copy data transfer technique), the method 1000 proceeds to step 1012. At step 1012, the method 1000 determines 1012 whether an available number of “mapping” windows is below a threshold (e.g., 100), an available number of “copy” windows is above a threshold (e.g., 100), and a cost difference between using the memory mapping data transfer technique and using the memory copy data transfer technique is below a threshold (e.g., 5 microseconds). If these conditions are satisfied, the method 1000 uses 1012 the memory copy data transfer technique to transfer data associated with the I/O request. In essence, this step 1012 uses the memory copy data transfer technique to transfer data associated with the I/O request if “mapping” windows are in short supply, “copy” windows are not in short supply, and the cost difference between the data transfer techniques is not too large. Otherwise, the method 1000 proceeds to the next step 1014.
At step 1014, if no “mapping” windows are available but at least one “copy” window is available, the method 1000 uses 1014 the memory copy data transfer technique to transfer data associated with the I/O request regardless of the cost difference between using the memory mapping data transfer technique and using the memory copy data transfer technique. In essence, this step 1014 uses the memory copy data transfer technique to transfer data associated with the I/O request if it is the only option available, even if using the memory mapping data transfer technique would be more efficient. Otherwise, the method 1000 proceeds to the next step 1016.
At step 1016, if no “mapping” windows or “copy” windows are available, the method 1000 waits 1016 for the next available window (“copy” window or “mapping” window) and uses this window along with the corresponding data transfer technique to transfer the data. This is performed regardless of the cost difference between the memory mapping data transfer technique and the memory copy data transfer technique. Otherwise, the method 1000 proceeds to the next step 1018. At step 1018, the method 1000 uses 1018 the memory mapping data transfer technique to transfer data associated with the I/O request since “mapping” windows are available and using the memory mapping data transfer technique is less costly than using the memory copy data transfer technique.
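The decision flow of FIG. 10 is symmetric between the two techniques, so it can be condensed into a single function. The sketch below is only an illustration under stated assumptions: the function name `choose_technique`, its parameter names, and the `"wait"` return value are hypothetical. The default thresholds mirror the example values in the text (100 windows, 5 microseconds). Equal costs are treated as favoring memory mapping, following the "not less costly" branch at step 1002.

```python
def choose_technique(copy_cost_us, mapping_cost_us,
                     copy_free, mapping_free,
                     low=100, high=100, max_diff_us=5.0):
    """Return "copy", "mapping", or "wait" following the FIG. 10 walk-through.

    Costs are in microseconds; `copy_free` and `mapping_free` are the numbers
    of currently available windows of each type.
    """
    # Step 1002: pick the cheaper technique as the preferred one.
    if copy_cost_us < mapping_cost_us:
        preferred, other = "copy", "mapping"
        pref_free, other_free = copy_free, mapping_free
    else:
        preferred, other = "mapping", "copy"
        pref_free, other_free = mapping_free, copy_free
    diff = abs(copy_cost_us - mapping_cost_us)

    # Steps 1004/1012: preferred windows are scarce, the other type is
    # plentiful, and the cost penalty is small, so use the other technique.
    if pref_free < low and other_free > high and diff < max_diff_us:
        return other
    # Steps 1006/1014: preferred windows are exhausted, so fall back to the
    # other technique regardless of cost if one of its windows is free.
    if pref_free == 0 and other_free > 0:
        return other
    # Steps 1008/1016: nothing is free; the caller waits for the next window
    # of either type and uses the technique that matches it.
    if pref_free == 0 and other_free == 0:
        return "wait"
    # Steps 1010/1018: a preferred window is available, so use it.
    return preferred
```

For instance, with copy cheaper by 2 microseconds, 50 free "copy" windows, and 500 free "mapping" windows, this sketch returns `"mapping"`, which conserves the scarce "copy" windows at a small cost penalty.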
The flowcharts and/or block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer-usable media according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (20)

The invention claimed is:
1. A method for dynamically switching between memory copy and memory mapping data transfer techniques to improve I/O performance, the method comprising:
receiving an I/O request;
computing a cost of executing the I/O request using a memory copy data transfer technique, the memory copy data transfer technique copying cache segments associated with the I/O request from cache memory to a permanently mapped memory, wherein the permanently mapped memory is permanently mapped to a bus address window;
computing a cost of executing the I/O request using a memory mapping data transfer technique, the memory mapping data transfer technique temporarily mapping cache segments associated with the I/O request from the cache memory to the bus address window;
using the memory copy data transfer technique to transfer cache segments associated with the I/O request in the event using the memory copy data transfer technique is less costly than using the memory mapping data transfer technique; and
using the memory mapping data transfer technique to transfer cache segments associated with the I/O request in the event using the memory mapping data transfer technique is less costly than using the memory copy data transfer technique.
2. The method of claim 1, wherein the bus address window is a Peripheral Component Interconnect (PCI) bus address window.
3. The method of claim 1, wherein computing the cost of executing the I/O request using the memory copy data transfer technique comprises calculating a number of cache segments to copy to the permanently mapped memory.
4. The method of claim 1, wherein computing the cost of executing the I/O request using the memory copy data transfer technique comprises determining copy latency between the cache memory and the permanently mapped memory.
5. The method of claim 4, wherein determining the copy latency between the cache memory and the permanently mapped memory comprises determining locations of the cache memory and the permanently mapped memory.
6. The method of claim 1, wherein computing the cost of executing the I/O request using the memory mapping data transfer technique comprises estimating an amount of time needed to at least one of map and unmap cache segments associated with the I/O request from the cache memory to the bus address window.
7. The method of claim 1, wherein the cache segments are not all contiguous in the cache memory.
8. A computer program product for dynamically switching between memory copy and memory mapping data transfer techniques to improve I/O performance, the computer program product comprising a computer-readable medium having computer-usable program code embodied therein, the computer-usable program code configured to perform the following when executed by at least one processor:
receive an I/O request;
compute a cost of executing the I/O request using a memory copy data transfer technique, the memory copy data transfer technique copying cache segments associated with the I/O request from cache memory to a permanently mapped memory, wherein the permanently mapped memory is permanently mapped to a bus address window;
compute a cost of executing the I/O request using a memory mapping data transfer technique, the memory mapping data transfer technique temporarily mapping cache segments associated with the I/O request from the cache memory to the bus address window;
use the memory copy data transfer technique to transfer cache segments associated with the I/O request in the event using the memory copy data transfer technique is less costly than using the memory mapping data transfer technique; and
use the memory mapping data transfer technique to transfer cache segments associated with the I/O request in the event using the memory mapping data transfer technique is less costly than using the memory copy data transfer technique.
9. The computer program product of claim 8, wherein the bus address window is a Peripheral Component Interconnect (PCI) bus address window.
10. The computer program product of claim 8, wherein computing the cost of executing the I/O request using the memory copy data transfer technique comprises calculating a number of cache segments to copy to the permanently mapped memory.
11. The computer program product of claim 8, wherein computing the cost of executing the I/O request using the memory copy data transfer technique comprises determining copy latency between the cache memory and the permanently mapped memory.
12. The computer program product of claim 11, wherein determining the copy latency between the cache memory and the permanently mapped memory comprises determining locations of the cache memory and the permanently mapped memory.
13. The computer program product of claim 8, wherein computing the cost of executing the I/O request using the memory mapping data transfer technique comprises estimating an amount of time needed to at least one of map and unmap cache segments associated with the I/O request from the cache memory to the bus address window.
14. The computer program product of claim 8, wherein the cache segments are not all contiguous in the cache memory.
15. A system for dynamically switching between memory copy and memory mapping data transfer techniques to improve I/O performance, the system comprising:
at least one processor;
at least one memory device coupled to the at least one processor and storing instructions for execution on the at least one processor, the instructions causing the at least one processor to:
receive an I/O request;
compute a cost of executing the I/O request using a memory copy data transfer technique, the memory copy data transfer technique copying cache segments associated with the I/O request from cache memory to a permanently mapped memory, wherein the permanently mapped memory is permanently mapped to a bus address window;
compute a cost of executing the I/O request using a memory mapping data transfer technique, the memory mapping data transfer technique temporarily mapping cache segments associated with the I/O request from the cache memory to the bus address window;
use the memory copy data transfer technique to transfer cache segments associated with the I/O request in the event using the memory copy data transfer technique is less costly than using the memory mapping data transfer technique; and
use the memory mapping data transfer technique to transfer cache segments associated with the I/O request in the event using the memory mapping data transfer technique is less costly than using the memory copy data transfer technique.
16. The system of claim 15, wherein the bus address window is a Peripheral Component Interconnect (PCI) bus address window.
17. The system of claim 15, wherein computing the cost of executing the I/O request using the memory copy data transfer technique comprises calculating a number of cache segments to copy to the permanently mapped memory.
18. The system of claim 15, wherein computing the cost of executing the I/O request using the memory copy data transfer technique comprises determining copy latency between the cache memory and the permanently mapped memory.
19. The system of claim 18, wherein determining the copy latency between the cache memory and the permanently mapped memory comprises determining locations of the cache memory and the permanently mapped memory.
20. The system of claim 15, wherein computing the cost of executing the I/O request using the memory mapping data transfer technique comprises estimating an amount of time needed to at least one of map and unmap cache segments associated with the I/O request from the cache memory to the bus address window.
US16/567,747 | 2019-09-11 | 2019-09-11 | Dynamically switching between memory copy and memory mapping to optimize I/O performance | Expired - Fee Related | US11016692B2 (en)

Priority Applications (6)

Application Number | Priority Date | Filing Date | Title
US16/567,747 US11016692B2 (en) | 2019-09-11 | 2019-09-11 | Dynamically switching between memory copy and memory mapping to optimize I/O performance
JP2022515916A JP7495191B2 (en) | 2019-09-11 | 2020-09-03 | Dynamically switching between memory copy and memory mapping to optimize I/O performance
DE112020003721.5T DE112020003721B4 (en) | 2019-09-11 | 2020-09-03 | DYNAMIC SWITCHING BETWEEN MEMORY COPY AND MEMORY IMAGE DATA TRANSFER TECHNIQUES TO IMPROVE I/O PERFORMANCE
CN202080049597.2A CN114127699B (en) | 2019-09-11 | 2020-09-03 | Dynamically switch between memory copy and memory mapping to optimize I/O performance
GB2203249.4A GB2602404B (en) | 2019-09-11 | 2020-09-03 | Dynamically switching between memory copy and memory mapping to optimize I/O performance
PCT/IB2020/058197 WO2021048709A1 (en) | 2019-09-11 | 2020-09-03 | Dynamically switching between memory copy and memory mapping to optimize I/O performance

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
US16/567,747 US11016692B2 (en) | 2019-09-11 | 2019-09-11 | Dynamically switching between memory copy and memory mapping to optimize I/O performance

Publications (2)

Publication Number | Publication Date
US20210072918A1 (en) | 2021-03-11
US11016692B2 (en) | 2021-05-25

Family

ID=74849768

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US16/567,747 (Expired - Fee Related) US11016692B2 (en) | Dynamically switching between memory copy and memory mapping to optimize I/O performance | 2019-09-11 | 2019-09-11

Country Status (6)

Country | Link
US (1) | US11016692B2 (en)
JP (1) | JP7495191B2 (en)
CN (1) | CN114127699B (en)
DE (1) | DE112020003721B4 (en)
GB (1) | GB2602404B (en)
WO (1) | WO2021048709A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11860773B2 (en) | 2022-02-03 | 2024-01-02 | Micron Technology, Inc. | Memory access statistics monitoring

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20070239905A1 (en) * | 2006-03-09 | 2007-10-11 | Banerjee Dwip N | Method and apparatus for efficient determination of memory copy versus registration in direct access environments
US8966133B2 (en) * | 2012-11-16 | 2015-02-24 | International Business Machines Corporation | Determining a mapping mode for a DMA data transfer
US9432298B1 (en) | 2011-12-09 | 2016-08-30 | P4tents1, LLC | System, method, and computer program product for improving memory systems
US10268583B2 (en) | 2012-10-22 | 2019-04-23 | Intel Corporation | High performance interconnect coherence protocol resolving conflict based on home transaction identifier different from requester transaction identifier
US20190171581A1 (en) | 2006-12-06 | 2019-06-06 | Longitude Enterprise Flash S.A.R.L. | Systems and methods for storage parallelism

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US4965717A (en) * | 1988-12-09 | 1990-10-23 | Tandem Computers Incorporated | Multiple processor system having shared memory with private-write capability
US6931457B2 (en) * | 2002-07-24 | 2005-08-16 | Intel Corporation | Method, system, and program for controlling multiple storage devices
US7644239B2 (en) * | 2004-05-03 | 2010-01-05 | Microsoft Corporation | Non-volatile memory cache performance improvement
JP2007079715A (en) | 2005-09-12 | 2007-03-29 | Fuji Xerox Co Ltd | Data transfer method, program and device
US20120036302A1 (en) | 2010-08-04 | 2012-02-09 | International Business Machines Corporation | Determination of one or more partitionable endpoints affected by an i/o message
US8719523B2 (en) * | 2011-10-03 | 2014-05-06 | International Business Machines Corporation | Maintaining multiple target copies
US8799588B2 (en) * | 2012-02-08 | 2014-08-05 | International Business Machines Corporation | Forward progress mechanism for stores in the presence of load contention in a system favoring loads by state alteration
US8943251B2 (en) * | 2012-05-14 | 2015-01-27 | Infineon Technologies Austria Ag | System and method for processing device with differentiated execution mode
US9430163B1 (en) * | 2015-12-15 | 2016-08-30 | International Business Machines Corporation | Implementing synchronization for remote disk mirroring
US10318417B2 (en) * | 2017-03-31 | 2019-06-11 | Intel Corporation | Persistent caching of memory-side cache content
CN109522102B (en) * | 2018-09-11 | 2022-12-02 | 华中科技大学 | A method for processing multi-task external memory pattern graph based on I/O scheduling

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Anonymously, "A Method to Improve Network Switch Performance of Large Data Blocks in Storage Network," IP.com No. IPCOM000254570D, Jul. 12, 2018.
Anonymously, "Method for enhanced application performance during storage migrations in multi-tier storage environment," IP.com No. IPCOM000254599D, Jul. 17, 2018.
EMC Corporation, "EMC VPLEX: Elements of Performance and Testing Best Practices Defined," EMC White Paper, available at: https://www.emc.com/collateral/white-papers/h11299-emc-vplex-elements-performance-testing-best-practices-wp.pdf, 2012.
Lee, Shin-Ying, "Intelligent Scheduling and Memory Management Techniques for Modern GPU Architectures," a Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy, Arizona State University, Jul. 2017.
McDavitt, Ben, et al., "System for Supporting a Container-Based Application on a Storage System" IP.com No. IPCOM000257716D, Mar. 5, 2019.

Also Published As

Publication number | Publication date
GB202203249D0 (en) | 2022-04-20
DE112020003721B4 (en) | 2025-02-06
CN114127699B (en) | 2024-06-25
GB2602404B (en) | 2022-11-09
JP7495191B2 (en) | 2024-06-04
US20210072918A1 (en) | 2021-03-11
DE112020003721T5 (en) | 2022-05-25
CN114127699A (en) | 2022-03-01
WO2021048709A1 (en) | 2021-03-18
JP2022547684A (en) | 2022-11-15
GB2602404A (en) | 2022-06-29

Legal Events

Date | Code | Title | Description
AS | Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GUPTA, LOKESH M.;ASH, KEVIN J.;RINALDI, BRIAN A.;AND OTHERS;SIGNING DATES FROM 20190905 TO 20190909;REEL/FRAME:050345/0019

FEPP | Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP | Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STPP | Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF | Information on status: patent grant

Free format text: PATENTED CASE

FEPP | Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS | Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH | Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP | Lapsed due to failure to pay maintenance fee

Effective date: 20250525

