CROSS-REFERENCE TO RELATED APPLICATION(S)
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/098,530, filed Dec. 31, 2014, and the subject matter thereof is incorporated herein by reference thereto.
TECHNICAL FIELD
Embodiments relate generally to a computing system, and more particularly to a system with distributed compute-enabled storage.
BACKGROUND
Modern consumer and industrial electronics, such as computing systems, servers, appliances, televisions, cellular phones, automobiles, satellites, and combination devices, are providing increasing levels of functionality to support modern life. These devices are increasingly interconnected, and storage of information is becoming more of a necessity.
Research and development in the existing technologies can take a myriad of different directions. Storing information locally or over a distributed network is becoming more important. Processing efficiency and inputs/outputs between storage and computing resources are more problematic as the amount of data, computation, and storage increases.
Thus, a need still remains for a computing system with a distributed compute-enabled storage group for ubiquitous storing and retrieving of information regardless of the source of the data or of the request for the data, respectively. In view of the ever-increasing commercial competitive pressures, along with growing consumer expectations and the diminishing opportunities for meaningful product differentiation in the marketplace, it is increasingly critical that answers be found to these problems. Additionally, the need to reduce costs, improve efficiencies and performance, and meet competitive pressures adds an even greater urgency to the critical necessity for finding answers to these problems.
Solutions to these problems have been long sought but prior developments have not taught or suggested any solutions and, thus, solutions to these problems have long eluded those skilled in the art.
SUMMARY
An embodiment provides an apparatus, including: a storage device configured to: perform in-storage processing with formatted data based on application data from an application; and return an in-storage processing output to the application for continued execution.
An embodiment provides a method including: performing in-storage processing with a storage device with formatted data based on application data from an application; and returning an in-storage processing output from the storage device to the application for continued execution.
Certain embodiments of the invention have other steps or elements in addition to or in place of those mentioned above. The steps or elements will become apparent to those skilled in the art from a reading of the following detailed description when taken with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a computing system with a distributed compute-enabled storage group in an embodiment of the present invention.
FIG. 2 is an example of an architectural view of a computing system with a distributed compute-enabled storage device.
FIG. 3 is an example of an operational view for a split function of the data preprocessor.
FIG. 4 is an example of an operational view for a split+padding function of the data preprocessor.
FIG. 5 is an example of an operational view for a split+redundancy function of the data preprocessor.
FIG. 6 is an example of an operational view for a mirroring function of the data preprocessor.
FIG. 7 is an example of an architectural view of the output coordinator.
FIGS. 8A and 8B are detailed examples of an operational view of the split and split+padding functions.
FIG. 9 is an example of an architectural view of the computing system in an embodiment.
FIG. 10 is an example of an architectural view of the computing system in a further embodiment.
FIG. 11 is an example of an architectural view of the computing system in a yet further embodiment.
FIG. 12 is an example of an operational view of the computing system issuing device requests for in-storage processing in a centralized coordination model.
FIG. 13 is an example of an operational view of the computing system issuing device requests for in-storage processing in a decentralized coordination model.
FIG. 14 is an operational view for the computing system of a centralized coordination model.
FIG. 15 is an operational view for a computing system in a decentralized model in an embodiment with one output coordinator.
FIG. 16 is an operational view of a computing system in a decentralized model in an embodiment with multiple output coordinators.
FIG. 17 is an example of a flow chart for the request distributor and the data preprocessor.
FIG. 18 is an example of a flow chart for a mirroring function for centralized and decentralized embodiments.
FIG. 19 is a flow chart of a method of operation of a computing system in an embodiment of the present invention.
DETAILED DESCRIPTION
Various embodiments provide a computing system for efficient distributed processing by providing methods and apparatus for performing in-storage processing with multiple storage devices capable of performing in-storage processing of the application data. An execution of an application can be shared by distributing the execution among various storage devices in a storage group. Each of the storage devices can perform in-storage processing with the application data as requested by an application request.
Various embodiments provide a computing system to reduce overall system power consumption by reducing the number of inputs/outputs between the application execution and the storage devices. This reduction is achieved by having the storage devices perform in-storage processing instead of the application merely storing, reading, and re-storing the data. The in-storage processing outputs can instead be returned, as an aggregated output from the various storage devices that performed the in-storage processing, back to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.
Various embodiments provide a computing system that reduces total cost of ownership by providing formatting and translation functions for the application data for different configurations or organizations of the storage group. Further, the computing system also provides translation for the in-storage processing to be carried out by the various storage devices as part of the storage group. Examples of types of translation or formatting include split, split+padding, split+redundancy, and mirroring.
Various embodiments provide a computing system that also minimizes integration obstacles by allowing the storage devices to handle more of the in-storage processing coordination functions, with less being done by the host executing the application. Another embodiment allows for the in-storage processing coordination to increasingly be located and operate outside of both the host and the storage devices.
Various embodiments provide a computing system with more efficient execution of the application with fewer interrupts to the application by coordinating the outputs of the in-storage processing from the storage devices. The output coordination can buffer the in-storage processing outputs and can also sort the order of each of the in-storage processing outputs before returning an aggregated output to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.
Various embodiments provide a computing system further minimizing integration obstacles by allowing the storage devices in the storage group to have different or the same functionalities. As an example, one of the storage devices can function as the only output coordinator for all the in-storage processing outputs from the other storage devices. As a further example, the aggregation function can be distributed amongst the storage devices, passing along from storage device to storage device and performing partial aggregation at each storage device, until a final one of the storage devices returns the full aggregated output back to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.
The following embodiments are described in sufficient detail to enable those skilled in the art to make and use the invention. It is to be understood that other embodiments may be evident based on the present disclosure, and that system, process, architectural, or mechanical changes can be made to the embodiments as examples without departing from the scope of the present invention.
In the following description, numerous specific details are given to provide a thorough understanding of the invention. However, it will be apparent that the invention and various embodiments may be practiced without these specific details. In order to avoid obscuring an embodiment of the present invention, some well-known circuits, system configurations, and process steps are not disclosed in detail.
The drawings showing embodiments of the system are semi-diagrammatic, and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing figures. Similarly, although the views in the drawings for ease of description generally show similar orientations, this depiction in the figures is arbitrary for the most part. Generally, an embodiment can be operated in any orientation.
The term “module” referred to herein can include software, hardware, or a combination thereof in an embodiment of the present invention in accordance with the context in which the term is used. For example, the software can be machine code, firmware, embedded code, application software, or a combination thereof. Also for example, the hardware can be circuitry, processor, computer, integrated circuit, integrated circuit cores, a pressure sensor, an inertial sensor, a microelectromechanical system (MEMS), passive devices, or a combination thereof. Additional examples of hardware circuitry can be digital circuits or logic, analog circuits, mixed-mode circuits, optical circuits, or a combination thereof. Further, if a module is written in the apparatus claims section below, the modules are deemed to include hardware circuitry for the purposes and the scope of apparatus claims.
The modules in the following description of the embodiments can be coupled to one another as described or as shown. The coupling can be direct or indirect, without or with, respectively, intervening items between the coupled items. The coupling can be by physical contact or by communication between the items.
Referring now to FIG. 1, therein is shown a computing system 100 with a distributed compute-enabled storage group in an embodiment of the present invention. The computing system 100 is depicted in FIG. 1 as a functional block diagram of the computing system 100 with a data storage system 101. The functional block diagram depicts the data storage system 101 installed in a host computer 102.
Various embodiments can include the computing system 100 with devices for storage, such as a solid state disk 110, a non-volatile memory 112, hard disk drives 116, memory devices 117, and network attached storage 122. These devices for storage can include capabilities to perform in-storage processing, that is, to independently perform relatively complex computations at a location outside of a traditional system CPU. As part of the in-storage processing paradigm, various embodiments of the present inventive concept manage the distribution of data, the location of data, and the location of processing tasks for in-storage processing. Further, these in-storage computing enabled storage devices can be grouped or clustered into arrays. Various embodiments manage the allocation of data and/or processing based on the architecture and capabilities of these devices or arrays. In-storage processing is further explained later.
As an example, the host computer 102 can be a server or workstation. The host computer 102 can include at least a central processing unit 104, host memory 106 coupled to the central processing unit 104, and a host bus controller 108. The host bus controller 108 provides a host interface bus 114, which allows the host computer 102 to utilize the data storage system 101.
It is understood that the function of the host bus controller 108 can be provided by the central processing unit 104 in some implementations. The central processing unit 104 can be implemented with hardware circuitry in a number of different manners. For example, the central processing unit 104 can be a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), a field programmable gate array (FPGA), or a combination thereof.
The data storage system 101 can be coupled to a solid state disk 110, such as a non-volatile memory based storage group having a peripheral interface system, or a non-volatile memory 112, such as an internal memory card for expanded or extended non-volatile system memory.
The data storage system 101 can also be coupled to hard disk drives (HDD) 116 that can be mounted in the host computer 102, external to the host computer 102, or a combination thereof. The solid state disk 110, the non-volatile memory 112, and the hard disk drives 116 can be considered as direct attached storage (DAS) devices, as an example.
The data storage system 101 can also support a network attach port 118 for coupling to a network 120. Examples of the network 120 can include a personal area network (PAN), a local area network (LAN), a storage area network (SAN), a wide area network (WAN), or a combination thereof. The network attach port 118 can provide access to network attached storage (NAS) 122. The network attach port 118 can also provide connection to and from the host bus controller 108.
While the network attached storage 122 is shown as hard disk drives, this is an example only. It is understood that the network attached storage 122 could include any non-volatile storage technology, such as magnetic tape storage (not shown), or storage devices similar to the solid state disk 110, the non-volatile memory 112, or the hard disk drives 116 that are accessed through the network attach port 118. Also, the network attached storage 122 can include aggregated resources, such as just a bunch of disks (JBOD) systems or redundant array of intelligent disks (RAID) systems, as well as other network attached storage 122.
The data storage system 101 can be attached to the host interface bus 114 for providing access to and interfacing with multiple of the direct attached storage (DAS) devices via a cable 124 for a storage interface, such as Serial Advanced Technology Attachment (SATA), Serial Attached SCSI (SAS), or Peripheral Component Interconnect Express (PCI-e) attached storage devices.
The data storage system 101 can include a storage engine 115 and memory devices 117. The storage engine 115 can be implemented with hardware circuitry, software, or a combination thereof in a number of ways. For example, the storage engine 115 can be implemented as a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), an FPGA, or a combination thereof.
The central processing unit 104 or the storage engine 115 can control the flow and management of data to and from the host computer 102, and from and to the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof. The storage engine 115 can also perform data reliability check and correction, which will be further discussed later. The storage engine 115 can also control and manage the flow of data between the direct attached storage (DAS) devices and the network attached storage 122 and amongst themselves. The storage engine 115 can be implemented in hardware circuitry, a processor running software, or a combination thereof.
For illustrative purposes, the storage engine 115 is shown as part of the data storage system 101, although the storage engine 115 can be implemented and partitioned differently. For example, the storage engine 115 can be implemented as part of the host computer 102, implemented in software, implemented in hardware, or a combination thereof. The storage engine 115 can be external to the data storage system 101. As examples, the storage engine 115 can be part of the direct attached storage (DAS) devices described above, the network attached storage 122, or a combination thereof. The functionalities of the storage engine 115 can be distributed as part of the host computer 102, the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof. The central processing unit 104 or some portion of it can also be in the data storage system 101, the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof.
The memory devices 117 can function as a local cache to the data storage system 101, the computing system 100, or a combination thereof. The memory devices 117 can be a volatile memory or a nonvolatile memory. Examples of the volatile memory can include static random access memory (SRAM) or dynamic random access memory (DRAM).
The storage engine 115 and the memory devices 117 enable the data storage system 101 to meet the performance requirements of data provided by the host computer 102 and store that data in the solid state disk 110, the non-volatile memory 112, the hard disk drives 116, or the network attached storage 122.
For illustrative purposes, the data storage system 101 is shown as part of the host computer 102, although the data storage system 101 can be implemented and partitioned differently. For example, the data storage system 101 can be implemented as a plug-in card in the host computer 102, as part of a chip or chipset in the host computer 102, as partially implemented in software and partially implemented in hardware in the host computer 102, or a combination thereof. The data storage system 101 can be external to the host computer 102. As examples, the data storage system 101 can be part of the direct attached storage (DAS) devices described above, the network attached storage 122, or a combination thereof. The data storage system 101 can be distributed as part of the host computer 102, the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof.
Referring now to FIG. 2, therein is shown an architectural view of a computing system 100 with a distributed compute-enabled storage device. The architectural view can depict an example of relationships between some parts in the computing system 100. As an example, the architectural view can depict the computing system 100 to include an application 202, an in-storage processing coordinator 204, and a storage group 206.
As an example, the storage group 206 can be partitioned in the computing system 100 of FIG. 1 in a number of ways. For example, the storage group 206 can be part of or distributed among the data storage system 101 of FIG. 1, the hard disk drives 116 of FIG. 1, the network attached storage 122 of FIG. 1, the solid state disk 110 of FIG. 1, the non-volatile memory 112 of FIG. 1, or a combination thereof.
The application 202 is a process executing a function. The application 202 can provide an end-user (not shown) function or other functions related to the operation, control, usage, or communication of the computing system 100. As an example, the application 202 can be a software application executed by a processor, a central processing unit (CPU), a programmable hardware state machine, or other hardware circuitry that can execute software code from the software application. As a further example, the application 202 can be a function executed purely in hardware circuitry, such as logic gates, a finite state machine (FSM), transistors, or a combination thereof. The application 202 can execute on the central processing unit 104 of FIG. 1.
The in-storage processing coordinator 204 manages the communication and activities between the application 202 and the storage group 206. The in-storage processing coordinator 204 can manage the operations between the application 202 and the storage group 206. As an example, the in-storage processing coordinator 204 can translate information between the application 202 and the storage group 206. Also for example, the in-storage processing coordinator 204 can direct information flow and assignments between the application 202 and the storage group 206. As an example, the in-storage processing coordinator 204 can include a data preprocessor 208, a request distributor 210, and an output coordinator 212.
As an example, the in-storage processing coordinator 204 or portions of it can be executed by the central processing unit 104 or other parts of the host computer 102. The in-storage processing coordinator 204 or portions of it can also be executed by the data storage system 101. As a specific example, the storage engine 115 of FIG. 1 can execute the in-storage processing coordinator 204 or portions of it. The hard disk drives 116 of FIG. 1, the network attached storage 122 of FIG. 1, the solid state disk 110 of FIG. 1, the non-volatile memory 112 of FIG. 1, or a combination thereof can execute the in-storage processing coordinator 204 or portions of it.
The data preprocessor 208 performs data formatting of application data 214 and placement of formatted data 216. The application data 214 is the information or data generated by the application 202. The formatting enables the application data 214 to be stored as the formatted data 216 across multiple storage devices 218 in the storage group 206 for in-storage processing (ISP).
In-storage processing refers to the processing or manipulation of the formatted data 216 to be sent back to the application 202 or the system executing the application 202. The in-storage processing is more than mere storing and retrieval of the formatted data 216. Examples of the manipulation or processing as part of the in-storage processing can include integer or floating point math operations, Boolean operations, reorganization of data bits or symbols, or a combination thereof. Other examples of manipulation or processing as part of the in-storage processing can include searching, sorting, comparing, filtering, or combining the formatted data 216, the application data 214, or a combination thereof.
As a further example, the data preprocessor 208 can format the application data 214 from the application 202 and generate the formatted data 216 to be processed outside of or independently from the execution of the application 202. This independent processing can be performed with the in-storage processing. The application data 214 can be independent of and not necessarily in the same format as the data stored in the storage group 206. The format of the application data 214 can be different than that of the formatted data 216, which will be described later.
Depending on the type of the application data 214, array configurations of the storage group 206, or other user-defined policies, the application data 214 can be processed in various ways. As an example, the policies can refer to availability requirements so as to affect the array configuration, such as mirroring, of the storage group 206. As a further example, the policies can refer to performance requirements so as to further affect the array configuration, such as striping, of the storage group 206.
As examples of translation, the application data 214 can be translated to the formatted data 216 using various methods, such as split, split+padding, split+redundancy, and mirroring. These methods can create independent data sets of the formatted data 216 that can be distributed to multiple storage devices 218, allowing for concurrent in-storage processing. Concurrent in-storage processing refers to each of the storage devices 218 in the storage group 206 being able to independently process or operate on the formatted data 216, the application data 214, or a combination thereof. This independent processing or operation can be independent of the execution of the application 202, of the other storage devices 218 of the storage group 206 that received some of the formatted data 216 from the application data 214, or a combination thereof.
The request distributor 210 manages application requests 220 between the application 202 and the storage group 206. As a specific example, the request distributor 210 accepts the application requests 220 from the application 202 and distributes them. The application requests 220 are actions between the application 202 and the storage group 206 based on the in-storage processing. For example, the application requests 220 can provide information from the application 202 to be off-loaded to the storage group 206 for in-storage processing. Furthering the example, the results of the in-storage processing can be returned to the application 202 based on the application requests 220.
As an example, the request distributor 210 manages the application requests 220 from the application 202 for in-storage processing, for write or storage, or for output. The request distributor 210 also distributes the application requests 220 from the application 202 across the multiple storage devices 218 in the storage group 206.
As another example, incoming application requests 220 for in-storage processing can be split into multiple sub-application requests 222 to perform in-storage processing according to a distribution of the formatted data 216, the organization of the storage group 206, or other policies. The request distributor 210 can perform this split of the application request 220 for in-storage processing based on the placement scheme for the application data 214, the formatted data 216, or a combination thereof.
Example types of data placement schemes include a centralized scheme and a decentralized scheme, discussed in FIGS. 9 to 11. In various embodiments with a centralized scheme, the data preprocessor 208 is placed inside the in-storage processing coordinator 204, while a decentralized model places the data preprocessor 208 inside the storage group 206.
For the embodiments with a centralized scheme, once the in-storage processing coordinator 204 receives an application request 220 from the application 202, such as a data write request with required information, for example, address, data, data length, and a logical boundary, the request distributor 210 provides the data preprocessor 208 with the required information such as the data, data length, and logical boundary. Then, the data preprocessor 208 partitions the data into multiple data chunks of an appropriate size based on the store unit information. Then, the request distributor 210 distributes the corresponding data chunks to each of the storage devices 218 with multiple sub-application requests 222. The storage group 206, the storage devices 218, or a combination thereof can receive the application requests 220, the sub-application requests 222, or a combination thereof. On the other hand, the request distributor 210 in a decentralized model divides the data into chunks of a predefined size, for instance, data size/N, where N is the number of storage devices, and then distributes the chunks of data to each of the storage devices 218 with sub-application requests 222 combined with the required information such as the address, data length, and logical boundary. Then, the data preprocessor 208 inside the storage devices 218 partitions the assigned data into smaller chunks based on the store unit information.
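As a minimal, non-limiting sketch of the decentralized chunking described above (the function name, the sub-request fields, and the ceiling division are illustrative assumptions, not requirements of the embodiments), the request distributor 210 could divide the data into roughly data size/N chunks and issue one sub-application request per storage device:

    # Hypothetical sketch: divide application data into per-device chunks
    # (decentralized model); each device later sub-partitions by store unit.
    def distribute_decentralized(data: bytes, num_devices: int):
        chunk_size = (len(data) + num_devices - 1) // num_devices  # about data size / N
        sub_requests = []
        for i in range(num_devices):
            chunk = data[i * chunk_size:(i + 1) * chunk_size]
            if chunk:
                # each sub-application request carries address, length, and boundary information
                sub_requests.append({"device": i, "offset": i * chunk_size,
                                     "length": len(chunk), "data": chunk})
        return sub_requests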
As a further specific example, for a write request for the application data 214, given the application data 214 to be written, its length, and an optional logical boundary of the application data 214, the request distributor 210 can send the write request to the data preprocessor 208 so that it can determine how to distribute the application data 214. Once the data distribution is determined, the request distributor 210 can issue the write request to the storage devices 218 in the storage group 206. The host bus controller 108 of FIG. 1 or the network attach port 118 of FIG. 1 can be used to execute the request distributor 210 and issue the application requests 220.
Continuing with the example, the storage devices 218 can perform the in-storage processing on the formatted data 216. The request distributor 210 can process the request for output by forwarding the output request to the in-storage processing coordinator 204, or as a specific example to the output coordinator 212, to send in-storage processing outputs 224 back to the application 202 or the system executing the application 202. The application 202 can continue to execute with the in-storage processing outputs 224. The in-storage processing outputs 224 can be the results of the in-storage processing by the storage group 206 of the formatted data 216. The in-storage processing outputs 224 are not a mere read-back or read of the formatted data 216 stored in the storage group 206.
The output coordinator 212 can manage the processed data generated from each of the multiple storage devices 218 of the storage group 206 and can send it back to the application 202. As an example, the output coordinator 212 collects the results or the in-storage processing outputs 224 and provides them to the application 202, to various applications 202, or to the system executing the application 202. The output coordinator 212 will be described later.
The computing system 100 also can provide error handling capabilities. For example, when one or more of the storage devices 218 in the storage group 206 become inaccessible or have slower performance, the application requests 220 can fail, such as with time-outs or non-completions. For better availability, the computing system 100 can perform a number of actions.
The following are examples for the application requests 220 for writes to the storage group 206. The in-storage processing coordinator 204, or as a more specific example the request distributor 210, can maintain a request log that can be used to issue retries for the application requests 220 that failed or were not completed. Also as an example, the in-storage processing coordinator 204 can keep retrying the application requests 220 to write the application data 214. As a further example, the in-storage processing coordinator 204 can report the status of the application requests 220 to the application 202.
The following are examples for the application requests 220 for in-storage processing at the storage group 206. If one of the storage devices 218 in the storage group 206 includes a replica of the application data 214, the formatted data 216, or a combination thereof from the storage device 218 that was inaccessible, these application requests 220 can be redirected to the storage device 218 with the replica. If error recovery is possible, the error recovery process can be executed before the previously failed application requests 220 are reissued to the recovered storage device 218. An example of an error recovery technique can be a redundant array of inexpensive disks (RAID) recovery with rebuilding of a storage device 218 that has been striped. As other examples, the in-storage processing coordinator 204 can retry the application requests 220 that previously failed. The in-storage processing coordinator 204 can also generate reports of failures even if the application requests 220 are redirected, retried, and even eventually successful.
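As a minimal sketch of this error handling (the request-log structure, the retry limit, the replica lookup, and the ordering of redirect versus retry are illustrative assumptions only), the coordinator logic could resemble:

    # Hypothetical sketch of request-log based retry and replica redirection.
    def handle_failed_request(request, request_log, replica_map, max_retries=3):
        entry = request_log.setdefault(request["id"], {"retries": 0})
        replica_device = replica_map.get(request["device"])   # device holding a replica, if any
        if replica_device is not None:
            request["device"] = replica_device                # redirect to the replica
            return "redirected"
        if entry["retries"] < max_retries:
            entry["retries"] += 1                             # retry the same device
            return "retried"
        return "report_failure_to_application"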
The in-storage processing coordinator 204 or at least a portion of it can be implemented in a number of ways. As an example, the in-storage processing coordinator 204 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), an FPGA, or a combination thereof.
Referring now to FIG. 3, therein is shown an example of an operational view for a split function of the data preprocessor 208 of FIG. 2. FIG. 3 depicts the application data 214 as input to the data preprocessor 208 or, more generally, the in-storage processing coordinator 204 of FIG. 2. FIG. 3 depicts one example method of the data formatting performed by the data preprocessor 208 as mentioned in FIG. 2. In this example, the data formatting is a split function or a split scheme. FIG. 3 also depicts the formatted data 216 as the output of the data preprocessor 208.
In this example, the amount of the application data 214 is shown to span a transfer length 302. The transfer length 302 refers to the amount of data or information sent by the application 202 to the data preprocessor 208 or vice versa. The transfer length 302 can be a fixed size or variable depending on what the application 202 transfers for in-storage processing.
Also in this example, the application data 214 can include application units 304. The application units 304 are fields within or portions of the application data 214. Each of the application units 304 can be fixed in size or can be variable. As an example, the application units 304 can represent partitioned portions or chunks of the application data 214.
As an example, the size of each of the application units 304 can be the same across the application data 214. Also as an example, the size of each of the application units 304 across the application data 214 can differ for different transfers of the application data 214. Further for example, the size of each of the application units 304 can vary within the same transfer or across transfers. The application units 304 can also vary in size depending on the different applications 202 sending the application data 214. The number of application units 304 can vary or can be fixed. The number of application units 304 can vary for the same application 202 sending the application data 214 or between different applications 202.
FIG. 3 also depicts the formatted data 216 as the output of the data preprocessor 208. The formatted data 216 can include formatted units 306 (FUs). The formatted units 306 are fields within the formatted data 216. In this example, each of the formatted units 306 can be fixed in size or can be variable. The size of the formatted units 306 can be the same for the formatted data 216 or different for transfers of the formatted data 216, or can vary within the same transfer or across transfers. The formatted units 306 can also vary in size depending on the different applications 202 sending the formatted data 216. The number of the formatted units 306 can vary or can be fixed. The number of the formatted units 306 can vary for the same application 202 sending the formatted data 216 or between different applications 202.
FIG. 3 depicts the formatted data 216 after a split formatting with the formatted units 306 overlaid visually with the application units 304. An example storage application for this split formatting or split scheme can be with redundant array of inexpensive disks (RAID) systems as the storage group 206, or with at least some of the multiple storage devices 218 in the storage group 206. The in-storage processing, or even the mere storage of the application data 214, can at least involve splitting the application units 304 to different destination devices in the storage group 206.
Continuing with this example, the data preprocessor 208 can split the application data 214 into predefined fixed-length blocks referred to as the formatted units 306 and can give each block to one or more of the multiple storage devices 218 for in-storage processing in a round robin fashion, as an example. The split scheme can generate non-aligned data sets between the application data 214 and the formatted data 216. As a specific example, the data preprocessor 208 can generate the non-alignment between the application units 304 relative to the boundaries for the formatted units 306.
Further with this example, FIG. 3 depicts an alternating formatting or allocation of the application units 304 to the different devices in the storage group 206. In this example, the application units 304 are depicted as "Data 1", "Data 2", "Data 3", and through "Data K". The formatted units 306 are depicted as "FU 1", "FU 2", and through "FU N".
As a specific example, the formatted data 216 can have alternating instances targeted for one device or another device in the storage group 206. In other words, for example, odd numbered "FUs" can be for drive 1 and even numbered "FUs" can be for drive 0. The overlay of the application units 304 as "Data" is shown as not aligned with the boundaries of the "FUs", and FIG. 3 depicts "Data 2" and "Data K" being split between FU 1 (drive 1) and FU 2 (drive 0), again for this example.
As a further example, the formatted data 216 can also be stored on one of the storage devices 218 as opposed to being partitioned or allocated to different instances of the storage devices 218 in the storage group 206. In this example, the formatted units 306 can be sized for a sector based on a physical block address or a logical block address on one of the storage devices 218, such as a hard disk drive or a solid state disk drive.
As a specific example for the split function, the request distributor 210 can initially send up to N in-storage processing application requests 220 to the storage devices 218 in the storage group 206. The term N is an integer number. The application units 304 that are not aligned with the formatted units 306, such as "Data 2" and "Data K" in this figure and example, can undergo additional processing at the storage devices 218 with in-storage processing.
For example, the non-aligned application units 304 can be determined after initial processing of the application data 214 by the host computer 102 of FIG. 1, the request distributor 210, or other storage devices 218. The non-aligned application units 304 can be fetched by the host computer 102 or the request distributor 210, allowing the non-aligned application units 304 to be concurrently processed by the host computer 102, the request distributor 210, the storage devices 218 in the storage group 206, or a combination thereof. The non-aligned application units 304 can also be fetched by the host computer 102 or the request distributor 210 such that these non-aligned application units 304 can be written back to the devices for in-storage processing. Each of the storage devices 218 can send the results of the processed non-aligned application units 304 to the host computer 102, the request distributor 210, or the other storage devices 218 so that the host computer 102 or the other storage devices 218 can continue to process the application data 214.
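As an illustrative sketch of the split scheme of FIG. 3 (the fixed formatted-unit size, the round-robin device assignment, and the function name are assumptions made only for illustration), the data preprocessor 208 could be modeled as follows:

    # Hypothetical sketch: split application data into fixed-length formatted
    # units (FUs) and assign the FUs to storage devices in round-robin order.
    def split(application_data: bytes, fu_size: int, num_devices: int):
        formatted_units = []
        for n, offset in enumerate(range(0, len(application_data), fu_size)):
            fu = application_data[offset:offset + fu_size]
            device = n % num_devices          # round-robin placement across devices
            formatted_units.append({"fu_index": n, "device": device, "data": fu})
        return formatted_units                # application units may straddle FU boundaries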
Referring now to FIG. 4, therein is shown an example of an operational view for a split+padding function of the data preprocessor 208 of FIG. 2. FIG. 4 depicts the application data 214 as input to the data preprocessor 208 as similarly described in FIG. 3. FIG. 4 also depicts the formatted data 216 as the output of the data preprocessor 208. FIG. 4 depicts one example method of the data formatting performed by the data preprocessor 208 as mentioned in FIG. 2. In this example, the data formatting is a split+padding function or split+padding scheme.
In this example, the split+padding function by the data preprocessor 208 adds data pads 402 to align the application units 304 to the formatted units 306. The alignment of the application units 304 and the formatted units 306 can allow the request distributor 210 of FIG. 2 to send up to K independent in-storage processing application requests 220 to the multiple storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2. The term K is an integer. In other words, the alignment allows each of the multiple storage devices 218 to perform in-storage processing of the application units 304, the formatted units 306, or a combination thereof independently, without requiring further formatting or processing of the formatted data 216.
As a specific example, each of the formatted units 306 includes one of the application units 304 plus one of the data pads 402. Each of the data pads 402 aligns each of the application units 304 to the boundaries of each of the formatted units 306. The data pads 402 can also provide other functions or include other information. For example, the data pads 402 can include error detection or error correction information, such as parity, ECC protection, meta-data, etc.
The data pads 402 can be placed or located in a number of different locations within the formatted units 306. For example, one of the data pads 402 can be located at the end of one of the application units 304 as shown in FIG. 4. Also for example, each of the data pads 402 can be located at the beginning of each of the formatted units 306 and before each of the application units 304. Further for example, each of the data pads 402 can be distributed, uniformly or non-uniformly, across each of the formatted units 306 and within each of the application units 304.
As an example, the size of each of the data pads 402 can depend on the difference in size between each of the application units 304 and each of the formatted units 306. The data pads 402 can be the same for each of the formatted units 306 or can vary. Further for example, the term size can refer to the number of bits or symbols for the formatted units 306, the application units 304, or a combination thereof. The term size can also refer to the transmission time, recording time, or a combination thereof for the formatted units 306, the application units 304, or a combination thereof.
In this example, the size of the application data 214 is shown to span the transfer length 302 as similarly described in FIG. 3. In this example, the application units 304 are depicted as "Data 1", "Data 2", "Data 3", and through "Data K". In this example, the formatted units 306 are depicted as "FU 1", "FU 2", and through "FU N".
FIG. 4 depicts the formatted data 216 after a split+padding formatting with the formatted units 306 overlaid visually with the application units 304 and with the data pads 402. An example storage application for this split+padding formatting or scheme can be with use of redundant array of inexpensive disks (RAID) systems as the storage group 206 or with at least some of the multiple storage devices 218 in the storage group 206. The in-storage processing, or even the mere storage of the application data 214, can at least involve splitting+padding the application units 304 to different destination devices in the storage group 206.
Continuing with this example, the data preprocessor 208 can split the application data 214 with the data pads 402 into a predefined length and give each length to one or more of the storage devices 218 for in-storage processing. A length can include any number of the application units 304. As a specific example, the formatted data 216 of any length can be targeted for one or more of the multiple storage devices 218 in the storage group 206.
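As an illustrative sketch of the split+padding scheme of FIG. 4 (the zero-byte padding, the fixed formatted-unit size, and the function name are illustrative assumptions; an actual pad could instead carry parity, ECC, or meta-data, and could be placed before or within the application unit), each application unit could be padded out to a formatted-unit boundary as follows:

    # Hypothetical sketch: pad each application unit to the formatted unit (FU)
    # size so every FU holds exactly one application unit plus its data pad.
    def split_with_padding(application_units: list[bytes], fu_size: int, num_devices: int):
        formatted_units = []
        for n, unit in enumerate(application_units):
            assert len(unit) <= fu_size, "application unit larger than an FU"
            pad = bytes(fu_size - len(unit))  # pad could instead carry parity/ECC/meta-data
            formatted_units.append({"fu_index": n,
                                    "device": n % num_devices,
                                    "data": unit + pad})
        return formatted_units                # up to K independent, aligned requests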
Referring now to FIG. 5, therein is shown an example of an operational view for a split+redundancy function of the data preprocessor 208 of FIG. 2. FIG. 5 depicts the application data 214 as input to the data preprocessor 208 as similarly described in FIG. 3. The application data 214 can include the application units 304.
FIG. 5 also depicts the formatted data 216 as the output of the data preprocessor 208 as similarly described in FIG. 3. The formatted data 216 can include the formatted units 306.
In this example, the split+redundancy function can process the aligned and non-aligned application units 304. The in-storage processing in each of the storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2 can process the aligned application units 304 separately, the non-aligned application units 304 separately, or both at the same time.
In this example, the data preprocessor 208 is performing the split+redundancy function or the split+redundancy scheme. As part of this function, the split portion can split the application data 214 into formatted data 216 of fixed length, variable length, or a combination thereof.
Also part of the split+redundancy function is the redundancy function. For the redundancy function in this example, the data preprocessor 208 does not necessarily need to manipulate the application data 214, the application units 304, or a combination thereof that are non-aligned to the formatted units 306, as in the split function described in FIG. 3. This is depicted as the first row of the formatted data 216 in FIG. 5 and is redundancy data 502. The formatted data 216 generated from the split+redundancy function includes the redundancy data 502.
As an example, the redundancy data 502 can be an output of the data preprocessor 208 mapping the application data 214, or as a more specific example the application units 304, to the formatted data 216 and across the formatted units 306 even with some of the application units 304 non-aligned with the formatted units 306. In other words, some of the application units 304 fall within the boundary of one of the formatted units 306 and these application units 304 are considered aligned. Other instances of the application units 304 traverse multiple instances of the formatted units 306 and these application units 304 are considered non-aligned. As a specific example, the application units 304 depicted as "Data 2" and "Data K" each span across two adjacent instances of the formatted units 306.
Also as an example, the split+redundancy function can also perform the split+padding function on some of the application units 304. The data preprocessor 208 can store the application units 304 that are not aligned to the formatted units 306. This is depicted in the second row of the formatted data 216 of FIG. 5 and is the aligned data 504. For these particular non-aligned application units 304, the data preprocessor 208 can perform the split+padding function as described in FIG. 4 to form the aligned data 504. In the example depicted in FIG. 5, the application units 304 "Data 2" and "Data K" are not aligned to, or traverse, multiple instances of the formatted units 306. The aligned data 504 generated by the data preprocessor 208 includes the data pads 402 added to these instances of the non-aligned application units 304 from the redundancy data 502.
In this example, the split+redundancy function allows the in-storage processing coordinator 204 to send up to N+M requests to the storage devices 218 in the storage group 206. Both N and M are integers. N represents the number of formatted units 306 in the redundancy data 502. M represents the additional formatted units 306 in the aligned data 504. For the in-storage processing in each of the storage devices 218, the non-aligned application units 304 in the redundancy data 502 can be ignored.
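As an illustrative sketch of the split+redundancy scheme of FIG. 5 (the boundary bookkeeping and the function name are assumptions for illustration; the sketch reuses the hypothetical split() function from the FIG. 3 sketch above), the redundancy row could simply be the plain split output, while application units that straddle formatted-unit boundaries are additionally re-issued as padded, aligned formatted units:

    # Hypothetical sketch: N FUs from a plain split (redundancy data) plus M
    # extra padded FUs for application units that straddle FU boundaries.
    def split_with_redundancy(application_units: list[bytes], fu_size: int, num_devices: int):
        application_data = b"".join(application_units)
        redundancy_data = split(application_data, fu_size, num_devices)  # N FUs, some units non-aligned
        aligned_data, offset = [], 0
        for unit in application_units:
            start, end = offset, offset + len(unit)
            if start // fu_size != (end - 1) // fu_size:                 # unit crosses an FU boundary
                pad_len = (-len(unit)) % fu_size                         # pad up to an FU multiple
                aligned_data.append(unit + bytes(pad_len))
            offset = end
        return redundancy_data, aligned_data                             # up to N + M requests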
Referring now to FIG. 6, therein is shown an example of an operational view for a mirroring function of the data preprocessor 208 of FIG. 2. FIG. 6 depicts the formatted data 216 as the output of the data preprocessor 208 as similarly described in FIG. 3. The formatted data 216 of FIG. 2 can include the formatted units 306 of FIG. 3. The application data 214 of FIG. 3 can be processed by the data preprocessor 208.
When the application data 214 is mirrored in this example, at least some of the storage devices 218 of FIG. 2 can each receive all of the application data 214, which is replicated, or also referred to as mirrored. The application data 214 that is replicated is referred to as replica data 602. FIG. 6 depicts the multiple storage devices 218 as "Device 1" through "Device r" for the replica data 602. Replicated units 604 are the application units 304 of FIG. 3 that are replicated and are shown as "Data 1", "Data 2", "Data 3" through "Data K" on "Device 1" through "Device r". One of the storage devices 218 can store the application data 214 as the formatted data 216 for that storage device 218. Some of the other storage devices 218 can store the replica data 602 and the replicated units 604.
In this example, the data preprocessor 208 does not manipulate the application units 304 or the application data 214 as a whole. However, the data preprocessor 208 can collect or store mirroring information and the application units 304. Also, the in-storage processing coordinator 204 can receive the application data 214 or the application units 304 from the application 202 and process them for efficient, concurrent in-storage processing.
The in-storage processing coordinator 204 or the data preprocessor 208 can perform the mirroring functions in a number of ways. As an example, the in-storage processing coordinator 204 or the data preprocessor 208 can take into account factors for mirroring the application data 214 to the formatted data 216. One factor is the number of target devices from the multiple storage devices 218. Another factor is the size of the application data 214, the application units 304 of FIG. 3, or a combination thereof. A further factor is the size of the formatted data 216, the formatted units 306, or a combination thereof.
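As an illustrative sketch of the mirroring scheme of FIG. 6 (the parameter names and the list of target devices are assumptions for illustration), the application data could simply be replicated, unmodified, to r target devices so that any replica can serve an in-storage processing request:

    # Hypothetical sketch: replicate the application data, unmodified, to each
    # of r target devices ("Device 1" through "Device r") as replica data.
    def mirror(application_data: bytes, target_devices: list[int]):
        return [{"device": device, "data": application_data}  # one full replica per device
                for device in target_devices]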
Referring now to FIG. 7, therein is shown an example of an architectural view of the output coordinator 212. As noted earlier, the output coordinator 212 manages the in-storage processing outputs 224 generated from each of the multiple storage devices 218 of the storage group 206 and sends them back to the application 202. The output coordinator 212 can manage the interaction with the application 202 in a number of ways.
As an example, the output coordinator 212 function can be described as an output harvest 702, an output management 704, and an output retrieval 706. The output harvest 702 is a process for collecting the in-storage processing outputs 224. For example, the output harvest 702 can collect the in-storage processing outputs 224 from each of the storage devices 218 and store them. The storage can be done locally where the output harvest 702 is being executed. Also for example, the output harvest 702 can collect the locations of the in-storage processing outputs 224 in each of the storage devices 218.
The following are examples of various embodiments of how the output coordinator 212, or as a specific example the output harvest 702, can collect the in-storage processing outputs 224 from the storage devices 218. As an example, the output coordinator 212 can fetch the in-storage processing outputs 224 or their locations from each of the storage devices 218 that performed the in-storage processing of the application data 214 of FIG. 2, the formatted data 216 of FIG. 2, or a combination thereof.
As an example, the output coordinator 212 can fetch the in-storage processing outputs 224 in a number of ways. For example, the output coordinator 212 can utilize a direct memory access (DMA) with the storage devices 218. DMA transfers are transfer mechanisms not requiring a processor or a computing resource to manage the actual transfer once the transfer is set up. As another example, the output coordinator 212 can utilize a programmed input/output (PIO) with the storage devices 218. PIO transfers are transfer mechanisms where a processor or computing resource manages the actual transfer of data and not just the setup and status collection at the termination of the transfer. As a further example, the output coordinator 212 can utilize interface protocol commands, such as SATA vendor specific commands, PCIe, DMA, or Ethernet commands.
As an example, the storage devices 218 can send the in-storage processing outputs 224 to the output coordinator 212 in a number of ways. For example, the output coordinator 212 can utilize the DMA or PIO mechanisms. The DMA can be a remote DMA (rDMA) whereby the transfer is a DMA process from the memory of one computer (e.g. the computer running the application 202) into that of another (e.g. one of the storage devices 218 for the in-storage processing) without involving either one's operating system or processor intervention for the actual transfer. As another example, the output coordinator 212 can utilize interface protocol processes, such as a background SATA connection or Ethernet.
Also for example, the storage devices 218 can send their respective in-storage processing outputs 224 or their locations to the application 202. This can be accomplished without the in-storage processing outputs 224 passing through the output coordinator 212. For this example, the storage devices 218 and the application 202 can interact in a number of ways, such as DMA, rDMA, PIO, a background SATA connection, or Ethernet.
Regarding the output management 704, the output coordinator 212 can manage the order of the outputs from the storage devices 218. The output management 704 manages the outputs based on multiple constraints, such as the size of the outputs, the storage capacity of the output coordinator 212, and the types of the application requests 220 of FIG. 2. The outputs can be the in-storage processing outputs 224. As an example, the output management 704 can order the outputs based on various policies.
As a specific example, the outputs or the in-storage processing outputs 224 for each of the sub-application requests 222 of FIG. 2 for in-storage processing can be stored in a sorted order by a sub-request identification 708 per a request identification 710 for the in-storage processing. The request distributor 210 can transform the application request 220 of FIG. 2 into multiple sub-application requests 222 with the formatted data 216 and distribute them to the storage devices 218.
After data processing in each of the storage devices 218, the output coordinator 212 gathers the in-storage processing outputs 224 from each of the storage devices 218. The output coordinator 212 may need to preserve the issuing order of the application requests 220, the sub-application requests 222, or a combination thereof even though the in-storage processing outputs 224 from the storage devices 218 can be delivered to the output coordinator 212 in an arbitrary order because the data processing times of the storage devices 218 can be different.
As an example to implement this order, the storage group 206 of FIG. 2 can assign a sequence number to each of the in-storage processing outputs 224, where each of the in-storage processing outputs 224 also can be composed of multiple sub-outputs. For these sub-outputs, the storage group 206 also assigns sequence numbers or sequence identifications. Once the output coordinator 212 receives each of the in-storage processing outputs 224 or sub-output data from each of the storage devices 218, it can maintain each output's sequence, thereby sorting them by sequence numbers or identifications. If the order of the in-storage processing outputs 224 or the sub-outputs is not important for the application 202, the output coordinator 212 can send the in-storage processing outputs 224 in an out-of-order manner.
The request identification 710 represents information that can be used to demarcate one of the application requests 220 from another. The sub-request identification 708 represents information that can be used to demarcate one of the sub-application requests 222 from another.
As an example, the sub-request identification 708 can be unique or associated with a specific instance of the request identification 710. As a further example, the sub-request identification 708 can be non-constrained to a specific instance of the request identification 710.
As a more specific example, the output coordinator 212 can include an output buffer 712. The output buffer 712 can store the in-storage processing outputs 224 from the storage devices 218. The output buffer 712 can be implemented in a number of ways. For example, the output buffer 712 can be a hardware implementation of a first-in first-out (FIFO) circuit or of a linked list structure. Also for example, the output buffer 712 can be implemented with memory circuitry with software providing the intelligence for the FIFO operations, such as pointers, status flags, etc.
Also as a specific example, the outputs or the in-storage processing outputs 224 for each of the sub-application requests 222 can be added to the output buffer 712. The in-storage processing outputs 224 can be fetched from the output buffer 712 as long as the output for the desired instance of the sub-application requests 222 is in the output buffer 712. The sub-request identification 708 can be utilized to determine whether the associated in-storage processing output 224 has been stored in the output buffer 712. The request identification 710 can also be utilized, such as for an initial determination.
Continuing the example for various embodiments, the output coordinator 212 can collect the in-storage processing outputs 224 from the storage devices 218. To guarantee the data integrity of the in-storage processing outputs 224, the output coordinator 212 can maintain the sequence of each of the in-storage processing outputs 224 or sub-output data in a correct order. For this, the output coordinator 212 can utilize the sub-request identification 708 or the request identification 710 (e.g. if each of the in-storage processing outputs 224 of each of the storage devices 218 also reuses the same identification as its output sequence number or output sequence identification). Since the processing times of each of the storage devices 218 can be different, the output coordinator 212 can temporarily store each of the in-storage processing outputs 224 or sub-output data in the output buffer 712 to make them all sequential (i.e., in the correct data order). If there exists any missing in-storage processing output 224 or sub-output (that is, a hole in the sequence IDs), the application 202 cannot get the output data until all the in-storage processing outputs 224 are correctly collected in the output buffer 712.
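As an illustrative sketch of this reordering behavior (the class structure, buffer layout, and method names are assumptions made only for illustration), the output coordinator 212 could buffer out-of-order sub-outputs keyed by their sequence identification and release them to the application only once the sequence has no holes:

    # Hypothetical sketch: buffer out-of-order in-storage processing outputs
    # keyed by sub-request ID and release them in sequence once no hole remains.
    class OutputCoordinator:
        def __init__(self):
            self.output_buffer = {}   # sub_request_id -> output data
            self.next_expected = 0    # next sub-request ID owed to the application

        def harvest(self, sub_request_id: int, output: bytes):
            self.output_buffer[sub_request_id] = output

        def retrieve(self):
            # Return the in-order outputs available so far (toward the aggregated output).
            released = []
            while self.next_expected in self.output_buffer:
                released.append(self.output_buffer.pop(self.next_expected))
                self.next_expected += 1
            return released           # empty if a hole remains in the sequence IDs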
As a further specific example, the outputs or the in-storage processing outputs224 for each of thesub-application requests222 can be sent to theapplication202 without passing through theoutput coordinator212 or theoutput buffer712 in theoutput coordinator212. In this example, the in-storage processing outputs224 can be sent from thestorage devices218 without being stored before reaching theapplication202.
Regarding theoutput retrieval706, once the output or the in-storage processing outputs224 are known, theapplication202 can retrieve the in-storage processing outputs224 in a number of ways. In some embodiments, theoutput retrieval706 can include the in-storage processing outputs224 passing through theoutput coordinator212. In other embodiments, theoutput retrieval706 can include the in-storage processing outputs224 being sent to theapplication202 without passing through theoutput buffer712.
As an example, the outputs or the in-storage processing outputs224 can be passed from thestorage devices218 to theoutput coordinator212. Theoutput coordinator212 can store the in-storage processing outputs224 in theoutput buffer712. Theoutput coordinator212 can then send the in-storage processing outputs224 to theapplication202.
Also as an example, the outputs or the in-storage processing outputs224 can be passed from thestorage devices218 to theoutput coordinator212. Theoutput coordinator212 can send the in-storage processing outputs224 to therequest distributor210. Therequest distributor210 can send the in-storage processing outputs224 to theapplication202. In the example, theoutput buffer712 can be within theoutput coordinator212, therequest distributor210, or a combination thereof.
Further as an example, the outputs or the in-storage processing outputs224 can be passed from thestorage devices218 to theapplication202. In this example, this transfer is direct without the in-storage processing outputs224 to pass through theoutput coordinator212, therequest distributor210, or a combination thereof.
The output coordinator 212 can be implemented in a number of ways. For example, the output coordinator 212 can be implemented in hardware circuitry, such as a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), an FPGA, or a combination thereof. Also for example, the output coordinator 212 can be implemented with software. Further for example, the output harvest 702, the output management 704, the output retrieval 706, or a combination thereof can be implemented with hardware circuitry, with the examples noted earlier, or with software.
Similarly, the request distributor 210 can be implemented in a number of ways. For example, the request distributor 210 can be implemented in hardware circuitry, such as a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), an FPGA, or a combination thereof. Also for example, the request distributor 210 can be implemented with software.
Referring now to FIGS. 8A and 8B, therein are shown detailed examples of an operational view of the split and split+padding functions. FIGS. 8A and 8B depict embodiments for an in-storage processing (ISP)-aware RAID. Various embodiments can be applied to an array configuration for the storage devices 218 of FIG. 2 or the storage group 206 of FIG. 2. Examples of RAID functions include striping, mirroring, or a combination thereof.
FIGS. 8A and 8B depict examples of theapplication data214 and theapplication units304. Theapplication units304 can be processed by the in-storage processing coordinator204 ofFIG. 2.FIGS. 8A and 8B each depicts one example.
The example inFIG. 8A depicts theapplication data214 undergoing the split function, similarly to the one described inFIG. 3. This depiction can also represent a striping function in a RAID application.
The example inFIG. 8B depicts theapplication data214 undergoing a split+padding function, similarly to the one described inFIG. 4. This depiction can also represent a striping function in a RAID application but for various embodiments providing the in-storage processing for split+padding function.
Describing FIG. 8A, this part depicts the formatted data 216 and the formatted units 306. In this example, the formatted data 216 is split and sent to two of the storage devices 218. Each of the formatted units 306 includes one or more of the application units 304, such as FU0 in DEV1, which can include AU0 and AU1. These application units 304, such as AU0, AU1, AU2, AU4, etc., can each be entirely contained in one of the formatted units 306 or can traverse or span across multiple formatted units 306, such as AU3, AU6, AU8, etc. As described in FIG. 3, some of the application units 304 are aligned with the formatted units 306 while others are not.
In this example, there are shown 10 of the application units 304 being split into the formatted units 306 that are sent to two of the storage devices 218. In this example, the application units 304 labeled as AU1, AU3, AU5, AU6, and AU8 are not aligned. These non-aligned application units 304 can be identified with in-storage processing and separately processed by host systems or cooperatively with other storage devices 218. Therefore, the application requests 220 of FIG. 2 for in-storage processing can be serialized and more complex request coordination could be required.
Describing FIG. 8B, this part depicts the formatted data 216 and the formatted units 306, as in FIG. 8A. As in FIG. 8A, this example depicts the formatted data 216 being split in some form and sent to two of the storage devices 218. In this example, each of the application units 304, such as AU0, AU1, AU2, AU3, etc., can be aligned with one of the formatted units 306 with one of the data pads 402, as similarly described in FIG. 4.
In this example for use in ISP-aware RAID, the application units 304 are pre-processed and aligned by the split+padding policy, allowing each of the application requests 220 for in-storage processing to be independent. This independence can maximize the opportunity for efficient, concurrent processing since no additional phase of processing is required for the formatted units 306 with the aligned application units 304, compared with the non-aligned units.
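For illustration only, the following sketch contrasts the plain split with the split+padding policy by identifying which application units straddle a chunk boundary and how many whole units fit in a padded chunk; the sizes, constants, and function names are hypothetical and chosen only to make the example concrete.

```python
# Hedged sketch: with a plain split, some application units (AUs) straddle a
# chunk boundary and need coordinated processing; with split+padding, only
# whole AUs are placed in a chunk and the remainder is pad.
CHUNK_SIZE = 64          # bytes per formatted unit (hypothetical)
AU_SIZE = 24             # bytes per application unit (hypothetical)

def straddling_units(num_units, au_size, chunk_size):
    """Return indices of application units that span two formatted units."""
    out = []
    for i in range(num_units):
        start, end = i * au_size, (i + 1) * au_size - 1
        if start // chunk_size != end // chunk_size:
            out.append(i)
    return out

def padded_units_per_chunk(au_size, chunk_size):
    """With split+padding, only whole units go in a chunk; the rest is pad."""
    return chunk_size // au_size

print(straddling_units(10, AU_SIZE, CHUNK_SIZE))    # [2, 5] need coordination
print(padded_units_per_chunk(AU_SIZE, CHUNK_SIZE))  # 2 whole units per chunk
```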
Referring now toFIG. 9, therein is shown an example of an architectural view of thecomputing system900 in an embodiment. Thecomputing system900 can be an embodiment of thecomputing system100 ofFIG. 1.
In this embodiment as an example, FIG. 9 depicts the in-storage processing coordinator 904 in a centralized coordination model. In this model, the in-storage processing coordinator 904 is separate from or external to the host computer 102 and the storage devices 218. The terms separate and external represent that the in-storage processing coordinator 904 is in a separate system from the host computer 102 and the storage devices 218, and can be housed in a separate system housing.
In this example, thehost computer102 can be executing theapplication202 ofFIG. 2. Thehost computer102 can also provide file and object services. Further to this example, the in-storage processing coordinator904 can be included as part of thenetwork120 ofFIG. 1, thedata storage system101 ofFIG. 1, implemented external to thehost computer102, or a combination thereof. As previously described inFIG. 2 and other figures earlier, the in-storage processing coordinator904 can include therequest distributor910, thedata preprocessor908, and theoutput coordinator912.
Continuing with this example, each of thestorage devices218 performs the in-storage processing functions. Each of thestorage devices218 can include an in-storage processing engine922. The in-storage processing engine922 can perform the in-storage processing for itsrespective storage device218.
Thestorage devices218 can be located in a number of places within thecomputing system100. For example, thestorage devices218 can be located within thedata storage system101 ofFIG. 1, as part of thenetwork120 ofFIG. 1, thehard disk drive116 ofFIG. 1 or storage external to thehost computer102, or as part of the network attachedstorage122 ofFIG. 1.
In various embodiments in a centralized coordination model as in this example, the in-storage processing coordinator904 can function with thestorage devices218 in a number of ways. For example, thestorage devices218 can be configured to support various functions, such asRAID 0, 1, 2, 3, 4, 5, 6, and object stores.
The in-storage processing engine 922 can be implemented in a number of ways. For example, the in-storage processing engine 922 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), an FPGA, or a combination thereof.
Referring now toFIG. 10, therein is shown an example of an architectural view of thecomputing system1000 in a further embodiment. Thecomputing system1000 can be an embodiment of thecomputing system100 ofFIG. 1.
In this embodiment as an example, FIG. 10 depicts the in-storage processing coordinator 1004 in a centralized coordination model. In this model, the in-storage processing coordinator 1004 is internal to the host computer 102. The term internal represents that the in-storage processing coordinator 1004 is in the same system as the host computer 102 and is generally housed in the same system housing as the host computer 102. This embodiment also has the in-storage processing coordinator 1004 as separate from or external to the storage devices 218.
In this embodiment as an example, the host computer 102 can include the in-storage processing coordinator 1004 as well as the file and object services. In this example, the host computer 102 can execute the application 202 of FIG. 2. As previously described in FIG. 2 and other figures earlier, the in-storage processing coordinator 1004 can include the request distributor 1010, the data preprocessor 1008, and the output coordinator 1012.
Continuing with this example, each of thestorage devices218 performs the in-storage processing function. Each of thestorage devices218 can include an in-storage processing engine1022. The in-storage processing engine1022 can perform the in-storage processing for itsrespective storage device218.
Thestorage devices218 can be located in a number of places within thecomputing system100. For example, thestorage devices218 can be located within thedata storage system101 ofFIG. 1, as part of thenetwork120 ofFIG. 1, thehard disk drive116 ofFIG. 1 or storage external to thehost computer102 or as part of the network attachedstorage122 ofFIG. 1.
In various embodiments in a centralized model, as in this example, the in-storage processing coordinator 1004 can function with the storage devices 218 in a number of ways. For example, the storage devices 218 can be configured to support various functions, such as RAID 0, 1, 2, 3, 4, 5, 6, and object stores.
The in-storage processing engine1022 can be implemented in a number of ways. For example, in-storage processing engine1022 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include similar examples as inFIG. 9. The functions for this embodiment will be described in detail later.
Referring now to FIG. 11, therein is shown an example of an architectural view of the computing system 1100 in a yet further embodiment. The computing system 1100 can be an embodiment of the computing system 100 of FIG. 1.
In this embodiment as an example,FIG. 11 depicts the in-storage processing coordinator1104 in a decentralized coordination model. In this example, the in-storage processing coordinator1104 is partitioned between thehost computer102 and thestorage devices218. Additional examples of operational flow for this model are described inFIG. 15 and inFIG. 16.
As previously described in FIG. 2 and other figures earlier, the in-storage processing coordinator 1104 can include the request distributor 1110, the data preprocessor 1108, or a combination thereof. In this embodiment as an example, the data preprocessor 1108 and at least a portion of the request distributor 1110 are internal to the host computer 102. The term internal represents that the request distributor 1110 and the data preprocessor 1108 are in the same system as the host computer 102 and are housed in the same system housing as the host computer 102.
Also, this embodiment has theoutput coordinator1112 and at least a portion of therequest distributor1110 separate or external to thehost computer102. As a specific example, this embodiment provides theoutput coordinator1112 and at least a portion of therequest distributor1110 as internal to thestorage devices218.
In this example, thehost computer102 can execute theapplication202 ofFIG. 2. Continuing with this example, each of thestorage devices218 performs the in-storage processing function. Each of thestorage devices218 can include an in-storage processing engine1122. The in-storage processing engine1122 can perform the in-storage processing for itsrespective storage device218.
Thestorage devices218 can be located in a number of places within thecomputing system100. For example, thestorage devices218 can be located within thedata storage system101 ofFIG. 1, as part of thenetwork120 ofFIG. 1, thehard disk drive116 ofFIG. 1 or storage external to thehost computer102 or as part of the network attachedstorage122 ofFIG. 1.
In various embodiments in a decentralized model as in this example, this partition of the in-storage processing coordinator1104 can function with thestorage devices218 in a number of ways. For example, thestorage devices218 can be configured to support various functions, such asRAID 1 and object stores.
The in-storage processing engine1122 can be implemented in a number of ways. For example, in-storage processing engine1122 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include similar examples as inFIG. 9.
Referring now toFIG. 12, therein is shown an example of an operational view of thecomputing system100 for in-storage processing in a centralized coordination model.FIG. 12 can represent embodiments for the centralized coordination model described fromFIG. 9 orFIG. 10.
FIG. 12 depicts the in-storage processing coordinator204 and the interaction between therequest distributor210 and thedata preprocessor208 for the centralized coordination model.FIG. 12 also depicts theoutput coordinator212.FIG. 12 also depicts the in-storage processing coordinator204 interacting with thestorage devices218.
As an operational example,FIG. 12 depicts the in-storage processing coordinator204 issuing thedevice requests1202 for in-storage processing, such as write requests to thestorage devices218. Therequest distributor210 can receive the application requests220 ofFIG. 2 for writing theapplication data214. Therequest distributor210 can also receive adata address1204 as well as thetransfer length302 and alogical boundary1206 of theapplication units304. The data address1204 can represent the address for theapplication data214. Thelogical boundary1206 represents the length or size of each of theapplication units304.
Continuing with the example, therequest distributor210 can send information to the data preprocessor208 to translate theapplication data214 to the formatteddata216. Therequest distributor210 can also send thetransfer length302 for theapplication data214. Theapplication data214 can be sent to thedata preprocessor208 as theapplication units304 or the logical boundaries to theapplication units304.
Furthering the example, thedata preprocessor208 can translate theapplication data214 or theapplication units304 to generate the formatteddata216 or the formattedunits306 ofFIG. 3. Examples of the types of translation can be one of the methods described inFIG. 2 andFIG. 3 throughFIG. 6. Thedata preprocessor208 can return the formatteddata216 or the formattedunits306 to therequest distributor210. Therequest distributor210 can generate andissue device requests1202 for writes to thestorage devices218 based on the formatting policies and policy for storing or for in-storage processing of the formatteddata216 or the formattedunits306. The device requests1202 are based on the application requests220.
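For illustration only, the following sketch shows one possible write path in which a data preprocessor splits the application data into chunk-sized formatted units and a request distributor issues one device request per unit; the round-robin placement, the request fields, and all function names are hypothetical simplifications rather than the claimed method.

```python
# Hedged sketch of a centralized write path: translate application data into
# formatted units (plain split), then issue one hypothetical device request
# per unit, spread round-robin across the storage devices.
def preprocess_split(application_data, chunk_size):
    """Translate application data into chunk-sized formatted units."""
    return [application_data[i:i + chunk_size]
            for i in range(0, len(application_data), chunk_size)]

def distribute_write(application_data, chunk_size, storage_devices):
    """Build one device request per formatted unit, placed round-robin."""
    formatted_units = preprocess_split(application_data, chunk_size)
    device_requests = []
    for seq, unit in enumerate(formatted_units):
        device = storage_devices[seq % len(storage_devices)]
        device_requests.append({"device": device, "sequence": seq, "data": unit})
    return device_requests

reqs = distribute_write(b"x" * 200, chunk_size=64, storage_devices=["DEV1", "DEV2"])
print([(r["device"], r["sequence"], len(r["data"])) for r in reqs])
```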
Further continuing with the example, each of thestorage devices218 can include an in-storage processing function or application and the in-storage processing engine922. Each of thestorage devices218 can receive thedevice requests1202 and at least a portion of the formatteddata216.
For illustrative purposes, althoughFIG. 12 depicts thedevice requests1202 being issued to all of thestorage devices218, it is understood that therequest distributor210 can operate differently. For example, thedevice requests1202 can be issued to some of thestorage devices218 and not necessarily to all of them. Also for example, thedevice requests1202 can be issued at different times or can be issued as part of the error handling examples as discussed inFIG. 2.
As a specific example for a centralized coordination model, the in-storage processing coordinator204 can receive all the application requests220 from theapplication202, can issue all thedevice requests1202 to thestorage devices218, or a combination thereof. Therequest distributor210 can send or distribute thedevice requests1202 tomultiple storage devices218 based on a placement scheme. Theoutput coordinator212 can collect and manage the in-storage processing outputs224 from thestorage devices218. Theoutput coordinator212 can then send the in-storage processing outputs224 to theapplication202 ofFIG. 2 as similarly described inFIG. 7.
Referring now toFIG. 13, therein is shown an example of an operational view of thecomputing system1300 issuing data write requests to thestorage devices1318 for in-storage processing in a decentralized coordination model. Thecomputing system1300 can include similarities to thecomputing system1100 ofFIG. 11.FIG. 13 depicts the in-storage processing coordinator1304 including therequest distributor1310 and thedata preprocessor1308.
BothFIG. 12 andFIG. 13 depict an example of an operational view ofcomputing system1300 in terms of storing data to thestorage devices218. That is, bothFIGS. 12 and 13 focus on how to efficiently store data across thestorage devices218 for in-storage processing.
FIG. 13 also depicts theoutput coordinator1312 and a portion of therequest distributor1310 in each of thedevices1318.FIG. 13 also depicts the in-storage processing coordinator1304 interacting with thedevices1318.
As an operational example,FIG. 13 depicts the in-storage processing coordinator1304 issuing thedevice requests1302 as write requests to thedevices1318. Therequest distributor1310 in the in-storage processing coordinator1304 can receive the application requests220 ofFIG. 2 for writing theapplication data214 ofFIG. 2. Therequest distributor1310 can also receive adata address1204 as well as thetransfer length302 ofFIG. 3 and the logical boundary of theapplication units304 ofFIG. 3. The data address1204 can represent the address for theapplication data214.
Continuing with the example, therequest distributor1310 can send information to thedata preprocessor1308 to translate theapplication data214 to the formatteddata216 ofFIG. 2. Therequest distributor1310 can also send thetransfer length302 for theapplication data214. Theapplication data214 can be sent as theapplication units304 or the logical boundaries to theapplication units304 to thedata preprocessor1308.
Furthering the example, thedata preprocessor1308 can translate theapplication data214 or theapplication units304 to generate the formatteddata216 or the formattedunits306 of FIG.3. Examples of the types of translation can be one of the methods described inFIG. 2 andFIG. 3 throughFIG. 6. Thedata preprocessor1308 can return the formatteddata216 or the formattedunits306 to therequest distributor1310 in the in-storage processing coordinator1304. Therequest distributor1310 can generate and issue the application requests220 for writes to thedevices1318 based on the formatting policies and policy for storing or for in-storage processing of the formatteddata216 or the formattedunits306.
Further continuing with the example, each of thedevices1318 can include an in-storage processing function or application and the in-storage processing engine1322. Each of thedevices1318 can receive thedevice requests1302 and at least a portion of the formatteddata216. Each of thedevices1318 can also include theoutput coordinator1312, a portion of therequest distributor1310, or a combination thereof.
For illustrative purposes, althoughFIG. 13 depicts thedevice requests1302 being issued to all of thedevices1318, it is understood that therequest distributor1310 can operate differently. For example, thedevice requests1302 can be issued to some of thedevices1318 and not necessarily to all of them. Also for example, thedevice requests1302 can be issued at different times or can be issued as part of the error handling examples as discussed inFIG. 2.
As a specific example for a decentralized coordination model, the in-storage processing coordinator1304 can receive the application requests220 from theapplication202, can issue thedevice requests1302 to thedevices1318, or a combination thereof. Therequest distributor1310 in the in-storage processing coordinator1304 can send or distribute thedevice requests1302 tomultiple devices1318 based on a placement scheme.
Continuing with the specific example, therequest distributor1310 in each of thedevices1318 can receive the request from the in-storage processing coordinator1304. Theoutput coordinator1312 can collect and manage the in-storage processing outputs224 from thedevices1318 or one of thedevices1318.
Also as a specific example for a decentralized coordination model, there are various communication methods depending on the configuration of thestorage group206. The functions of therequest distributor1310 and theoutput coordinator1312 in thedevices1318 in a decentralized coordination model will be described later.
Referring now toFIG. 14, therein is shown an operational view for thecomputing system100 for in-storage processing in a centralized model.FIG. 14 depicts the in-storage processing coordinator904 to be external to both thehost computer102 and thestorage devices218. Although theapplication202 is shown outside of thehost computer102, it is understood that theapplication202 can be executed by thehost computer102 as well as outside of thehost computer102. In addition, although the in-storage processing coordinator904 is external to the host inFIG. 14, it is also understood that the in-storage processing coordinator904 can be internal to the host, like inFIG. 10.
FIG. 14,FIG. 15, andFIG. 16 depict an example of an operational view ofcomputing system1300 ofFIG. 13 in terms of processing data in thestorage devices218. That is,FIGS. 14, 15, and 16 focus on how to efficiently process/compute the stored data in thestorage devices218 with in-storage processing techniques.
In this example, theapplication202 can issue application requests220 for in-storage processing to thehost computer102. Thehost computer102 can issuehost requests1402 based on the application requests220 from theapplication202. The host requests1402 can be sent to the in-storage processing coordinator904.
The in-storage processing coordinator904 can translate theapplication data214 ofFIG. 2 and theapplication units304 ofFIG. 3 to generate the formatteddata216 ofFIG. 2 and the formattedunits306 ofFIG. 3. The in-storage processing coordinator904 can also generate thedevice requests1202 to thestorage devices218. The in-storage processing coordinator904 can also collect and manage the in-storage processing outputs224 from thestorage devices218, and can deliver an aggregatedoutput1404 back to thehost computer102, theapplication202, or a combination thereof. The aggregatedoutput1404 is the combination of the in-storage processing outputs224 from thestorage devices218. The aggregatedoutput1404 can be more than concatenation of the in-storage processing outputs224.
As a specific example, the in-storage processing coordinator904 can include therequest distributor910. Therequest distributor910 can receive the application requests220 as the host requests1402. Therequest distributor910 can generate thedevice requests1202 from the host requests1402. Therequest distributor910 can also generate thesub-application requests222 ofFIG. 7 as the device requests1202.
As a further specific example, the in-storage processing coordinator904 can include thedata preprocessor908. Thedata preprocessor908 can receive the information from the application requests220 or thehost requests1402 through therequest distributor910. Thedata preprocessor908 can format theapplication data214 as appropriate based on the placement scheme onto thestorage devices218.
Also as a specific example, the in-storage processing coordinator904 can include theoutput coordinator912. Theoutput coordinator912 can receive the in-storage processing outputs224 from thestorage devices218. Theoutput coordinator912 can generate the aggregatedoutput1404 with the in-storage processing outputs224. In this example, theoutput coordinator912 can return the aggregatedoutput1404 to thehost computer102. Thehost computer102 can also return the aggregatedoutput1404 to theapplication202. Theapplication202 can continue to execute and utilize the in-storage outputs224, the aggregatedoutput1404, or a combination thereof.
In this example, each of the storage devices 218 includes the in-storage processing engine 922. The in-storage processing engine 922 can receive and operate on a specific instance of the device requests 1202. The in-storage processing engine 922 can generate the in-storage processing output 224 to be returned to the in-storage processing coordinator 904 or, as a specific example, to the output coordinator 912.
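For illustration only, the following sketch shows one way an output coordinator in the centralized model could collect one in-storage processing output per device request and combine them into an aggregated output; the reduce step (a sum) and all names are hypothetical and only stand in for whatever combination the application actually needs.

```python
# Hedged sketch: collect one output per device request, then combine them into
# a single aggregated output for the host computer / application.
def aggregate_outputs(isp_outputs):
    """Combine per-device outputs; the combination can be more than concatenation."""
    return sum(isp_outputs)

def coordinate(device_requests, run_on_device):
    """Collect one in-storage processing output per request, then aggregate."""
    outputs = [run_on_device(req) for req in device_requests]
    return aggregate_outputs(outputs)

# Example: each device counts records in its portion of the formatted data.
requests = [{"device": "DEV1", "records": 120}, {"device": "DEV2", "records": 80}]
aggregated = coordinate(requests, run_on_device=lambda req: req["records"])
print(aggregated)   # 200 -- returned to the host computer / application
```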
Referring now toFIG. 15, therein is shown an operational view for acomputing system1500 in a decentralized model in an embodiment with oneoutput coordinator1512. Thecomputing system1500 can be thecomputing system1100 ofFIG. 11.
As an operational overview of this embodiment, the host computer 102 can issue an application request 220 to the storage devices 218 for in-storage processing. The host computer 102 and the storage devices 218 can be similarly partitioned as described in FIG. 11. Each of the storage devices 218 can perform the in-storage processing. Each of the storage devices 218 can provide its in-storage processing output 224 to the storage device 218 that received the application request 220 from the host computer 102. This storage device 218 can then return an aggregated output 1504 back to the host computer 102, the application 202, or a combination thereof. The application 202 can continue to execute and utilize the in-storage processing outputs 224, the aggregated output 1504, or a combination thereof.
Continuing with the example, theapplication request220 can be issued to one of thestorage devices218. That onestorage device218 can issue theapplication request220 or adevice request1202 to theother storage devices218. As an example, thestorage device218 that received theapplication request220 can decompose theapplication request220 to partition the in-storage processing to theother storage devices218. Thedevice request1202 can be that partitioned request based off theapplication request220 and the in-storage processing execution by theprevious storage devices218.
This example depicts a number of the devices labeled as “DEV_1”, “DEV_2”, “DEV_3”, and through “DEV_N”. The term “N” in the figure is an integer. Thestorage devices218 in this example can perform in-storage processing. Each of thestorage devices218 are shown including an in-storage processing engine1522, adata preprocessor1508, and anoutput coordinator1512.
For illustrative purposes, all of the storage devices 218 are shown with the output coordinator 1512, although it is understood that the computing system 1500 can be partitioned differently. For example, only one of the storage devices 218 can include the output coordinator 1512. Further for example, the output coordinator 1512 in each of the storage devices 218 can operate differently from another. As a specific example, the output coordinator 1512 in DEV_2 through DEV_N can act as a pass-through to the next storage device 218 or return the in-storage processing output 224 back to DEV_1. Each of the storage devices 218 can manage its request identification 710 of FIG. 7, the sub-request identification 708 of FIG. 7, or a combination thereof.
In this example, the host computer 102 can send the application request 220 to one of the storage devices 218 labeled DEV_1. The in-storage processing engine 1522 in DEV_1 can perform the appropriate level of in-storage processing and generate the in-storage processing output 224. In this example, the in-storage processing output 224 from DEV_1 can be referred to as a first output 1524.
Continuing with this example, the data preprocessor 1508 in DEV_1 can format or translate the information from the application request 220 that will be forwarded to DEV_2, DEV_3, and through to DEV_N. The in-storage processing engine 1522 in DEV_2 can generate the in-storage processing output 224, which can be referred to as a second output 1526. The output coordinator 1512 in DEV_2 can send the second output 1526 to DEV_1. The in-storage processing engine 1522 in DEV_3 can generate the in-storage processing output 224, which can be referred to as a third output 1528. The output coordinator 1512 in DEV_3 can send the third output 1528 to DEV_1. The in-storage processing engine 1522 in DEV_N can generate the in-storage processing output 224, which can be referred to as an Nth output. The output coordinator 1512 in DEV_N can send the Nth output to DEV_1. The output coordinator 1512 in DEV_1 generates the aggregated output 1504 that includes the first output 1524, the second output 1526, the third output 1528, and through the Nth output.
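For illustration only, the following sketch approximates the single-coordinator flow above, with DEV_1 decomposing the request, the other devices returning their outputs to DEV_1, and DEV_1 assembling the aggregated output; device behavior is simulated with a plain function and all names are hypothetical.

```python
# Hedged sketch: DEV_1 receives the application request, forwards a per-device
# portion to DEV_2..DEV_N, collects their outputs, and returns one aggregated
# output to the host. The real engines live inside the drives.
def isp_engine(device_name, portion):
    """Stand-in for a device's in-storage processing engine."""
    return f"{device_name}:{portion}"

def dev1_coordinate(application_request, devices):
    portions = [f"{application_request}-part{i}" for i in range(len(devices))]
    first_output = isp_engine(devices[0], portions[0])       # DEV_1 processes its share
    other_outputs = [isp_engine(dev, part)                   # DEV_2..DEV_N return to DEV_1
                     for dev, part in zip(devices[1:], portions[1:])]
    return [first_output] + other_outputs                    # aggregated output to host

print(dev1_coordinate("app-request", ["DEV_1", "DEV_2", "DEV_3"]))
```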
Referring now toFIG. 16, therein is shown an operational view for acomputing system1600 in a decentralized model in an embodiment withmultiple output coordinators1612. Thecomputing system1600 can be thecomputing system1100 ofFIG. 11.
As an operational overview of this embodiment, the host computer 102 can issue an application request 220 to the storage devices 218 for in-storage processing. The host computer 102 and the storage devices 218 can be similarly partitioned as described in FIG. 11. The application request 220 can be issued to one of the storage devices 218. That storage device 218 then performs the in-storage processing. The execution of the application request 220 and the in-storage processing results are issued or sent to another of the storage devices 218. This process can continue until all of the storage devices 218 have performed the in-storage processing, and the last of the storage devices 218 can return the result to the first of the storage devices 218. That first of the storage devices 218 then returns an aggregated output 1604 back to the host computer 102, the application 202, or a combination thereof. The application 202 can continue to execute and utilize the in-storage processing outputs 224 of FIG. 2, the aggregated output 1604, or a combination thereof.
For illustrative purposes, this embodiment is described with DEV_1 providing the aggregatedoutput1604 to thehost computer102, although it is understood that this embodiment can operate differently. For example, the last device or DEV_N in this example can provide the aggregatedoutput1604 back to thehost computer102 instead of DEV_1.
This example depicts a number of thestorage devices218 labeled as “DEV_1”, “DEV_2”, “DEV_3”, and through “DEV_N”. The term “N” in the figure is an integer. Thestorage devices218 in this example can perform in-storage processing. Each of thestorage devices218 are shown including an in-storage processing engine1622, adata preprocessor1608, and anoutput coordinator1612.
For illustrative purposes, all of the storage devices 218 are shown with the output coordinator 1612, although it is understood that the computing system 1600 can be partitioned differently. For example, only one of the storage devices 218 can include the output coordinator 1612 with full functionality. Further for example, the output coordinator 1612 in each of the storage devices 218 can operate differently from another. As a specific example, the output coordinator 1612 in DEV_2 through DEV_N can act as a pass-through to the next storage device 218 or return the aggregated output 1604 back to DEV_1.
In this example, the host computer 102 can send the application request 220 to one of the storage devices 218 labeled DEV_1. The in-storage processing engine 1622 in DEV_1 can perform the appropriate level of in-storage processing and can generate the in-storage processing output 224. In this example, the in-storage processing output 224 from DEV_1 can be referred to as a first output 1624. In this example, DEV_1 can decompose the application request 220 to partition the in-storage processing to DEV_2. The device request 1202 of FIG. 12 can be that partitioned request based on the application request 220 and the in-storage processing execution by DEV_1. This process of decomposing and partitioning can continue through DEV_N.
Continuing with this example, thedata preprocessor1608 in DEV_1 can format or translate the information from theapplication request220 that will be forwarded to DEV_2. Thedata preprocessor1608 in DEV_1 can also format or translate the in-storage processing output224 from DEV_1 or thefirst output1624.
Furthering this example, theoutput coordinator1612 in DEV_1 can send the output of thedata preprocessor1608 in DEV_1, thefirst output1624, a portion of theapplication request220, or a combination thereof to DEV_2. DEV_2 can continue the in-storage processing of theapplication request220 sent to DEV_1.
Similarly, the in-storage processing engine 1622 in DEV_2 can perform the appropriate level of in-storage processing based on the first output 1624 and can generate the in-storage processing output 224 from DEV_2. In this example, the in-storage processing output 224 from DEV_2 can be referred to as a second output 1626, or as "a partial aggregated output."
Continuing with this example, thedata preprocessor1608 in DEV_2 can format or translate the information from theapplication request220 or thesecond output1626 that will be forwarded to DEV_3. Thedata preprocessor1608 in DEV_2 can also format or translate the in-storage processing output224 from DEV_2 or thesecond output1626.
Furthering this example, theoutput coordinator1612 in DEV_2 can send the output of thedata preprocessor1608 in DEV_2, thesecond output1626, a portion of theapplication request220, or a combination thereof to DEV_3. DEV_3 can continue the in-storage processing of theapplication request220 sent to DEV_1.
Similarly, the in-storage processing engine 1622 in DEV_3 can perform the appropriate level of in-storage processing based on the second output 1626 and can generate the in-storage processing output 224 from DEV_3. In this example, the in-storage processing output 224 from DEV_3 can be referred to as a third output 1628.
Continuing with this example, the data preprocessor 1608 in DEV_3 can format or translate the information from the application request 220 or the third output 1628 that will be forwarded to DEV_1. The data preprocessor 1608 in DEV_3 can also format or translate the in-storage processing output 224 from DEV_3 or the third output 1628.
Furthering this example, theoutput coordinator1612 in DEV_3 can send the output of thedata preprocessor1608 in DEV_3, thethird output1628, a portion of theapplication request220, or a combination thereof to DEV_1. DEV_1 can return to thehost computer102 or theapplication202 the aggregatedoutput1604 based on thefirst output1624, thesecond output1626, and thethird output1628.
In this example, in-storage processing by one of thestorage devices218 that follows aprevious storage device218 can aggregate the in-storage processing outputs224 of thestorage devices218 that preceded it. In other words, thesecond output1626 is an aggregation of the in-storage processing output224 from the DEV_2 as well as thefirst output1624. Thethird output1628 is an aggregation of the in-storage processing output from DEV_3 as well as thesecond output1626.
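For illustration only, the following sketch approximates the chained flow above, in which each device folds its own output into the partial aggregated output it received before passing it on, and the final aggregate is returned to the host; the fold operation (addition) and all names are hypothetical.

```python
# Hedged sketch: each device processes its share, folds the result into the
# partial aggregated output received from the previous device, and passes the
# running aggregate along the chain; the last value is returned to the host.
def chained_aggregate(shares, fold=lambda acc, x: acc + x):
    partial = 0                          # DEV_1 starts the chain
    for device, share in shares:
        local_output = share             # stand-in for that device's ISP output
        partial = fold(partial, local_output)   # partial aggregated output passed on
    return partial                       # returned to the host / application

shares = [("DEV_1", 10), ("DEV_2", 7), ("DEV_3", 5)]
print(chained_aggregate(shares))         # 22
```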
Referring now toFIG. 17, therein is shown an example of a flow chart for therequest distributor210 and thedata preprocessor208. Therequest distributor210 and thedata preprocessor208 can be operated in a centralized or decentralized model as described earlier, as examples.
As an overview of this example, this flow chart depicts how theapplication data214 ofFIG. 2 can be translated to the formatteddata216 ofFIG. 2 based on the storage policies. As examples, the storage policies can include the split policy, the split+padding policy, the split+redundancy policy, and storage without any chunking of theapplication units304 ofFIG. 3 to the formattedunits306 ofFIG. 3. This example can represent theapplication request220 ofFIG. 2 as a write request.
The request distributor 210 of FIG. 2 can receive the application request 220 directly, or can receive some form of the application request 220 through the host computer 102 of FIG. 1. The application request 220 can include information such as the data address 1204 of FIG. 12, the application data 214, the transfer length 302 of FIG. 3, the logical boundary 1206 of FIG. 12, or a combination thereof.
As an example, therequest distributor210 can execute achunk comparison1702. Thechunk comparison1702 compares the transfer length with achunk size1704 of thestorage group206, in this example operating as a RAID system. Thechunk size1704 represents a discrete unit of storage size to be stored in thestorage devices218 ofFIG. 2 in thestorage group206 ofFIG. 2. As an example, thechunk size1704 can represent the size of one of the formattedunits306.
If thechunk comparison1702 determines thetransfer length302 is greater than thechunk size1704, the handling of theapplication request220 can continue to aboundary query1706. If thechunk comparison1702 determines that the transfer length is not greater than thechunk size1704, the handling of theapplication request220 can continue to adevice selection1708.
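For illustration only, the following sketch restates the dispatch described above as a small function that branches on the transfer length and on whether a logical boundary was provided; the return labels and the function name are hypothetical and only mirror the branches of the flow chart.

```python
# Hedged sketch of the top-level dispatch: compare the transfer length with
# the chunk size, then branch to device selection (no chunking), plain RAID
# chunking (no logical boundary), or the padding/redundancy handling.
def dispatch(transfer_length, chunk_size, logical_boundary=None):
    if transfer_length <= chunk_size:
        return "device_selection"            # store without chunking (e.g. mirroring)
    if logical_boundary is None:
        return "normal_raid_chunking"        # plain split of unstructured data
    return "padding_or_redundancy"           # boundary known: split+padding / +redundancy

print(dispatch(32, 64))                          # device_selection
print(dispatch(4096, 64))                        # normal_raid_chunking
print(dispatch(4096, 64, logical_boundary=24))   # padding_or_redundancy
```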
The branch of the flow chart starting with thedevice selection1708 represents the handling of theapplication data214 without chunking of theapplication units304 or theapplication data214. An example of this can be the mirroring function as described inFIG. 6.
Continuing with this branch of the flow chart, thedevice selection1708 determines which of thestorage devices218 in thestorage group206 will store theapplication data214 as part of theapplication request220. Therequest distributor210 can generate thedevice requests1202 ofFIG. 12 as appropriate based on theapplication request220.
When thelogical boundary1206 ofFIG. 12 for theapplication units304 are included with theapplication request220, therequest distributor210 can distribute theapplication request220 by splitting theapplication request220 tosub-application requests222 ofFIG. 2 or by sending identical application requests220 tomultiple storage devices218.
In the example for the sub-application requests 222, the size of each of the sub-application requests 222 can be made a multiple of the logical boundary 1206 of the application units 304. The sub-application requests 222 can be the device requests 1202 issued to the storage devices 218.
In the example for identical application requests220,multiple storage devices218 can receive these application requests220. The first in-storage processing output224 ofFIG. 2 returned can be accepted by theoutput coordinator212 ofFIG. 2 to be returned back to theapplication202. The identical application requests220 can be the device requests1202 issued to thestorage devices218.
When the logical boundary 1206 for the application units 304 is not included, the request distributor 210 can split the application request 220 into the sub-application requests 222. Each of these sub-application requests 222 can have an arbitrary length. The requests can be handled as a split function by the data preprocessor 208. The sub-application requests 222 can be the device requests 1202 issued to the storage devices 218.
Therequest distributor210, thedata preprocessor208, or a combination thereof can continue from thedevice selection1708 to anaddress calculation1710. Theaddress calculation1710 can calculate the address for theapplication data214 or the formatteddata216 to be stored in thestorage devices218 receiving the device requests1202. For illustrative purposes, theaddress calculation1710 is described being performed by therequest distributor210 or thedata preprocessor208, although it is understood that theaddress calculation1710 can be performed elsewhere. For example, thestorage devices218 receiving thedevice requests1202 can perform theaddress calculation1710. Also for example, the address can be a pass-through from theapplication request220 in which case theaddress calculation1710 could have been performed by theapplication202 ofFIG. 2 or by thehost computer102.
The flow chart can continue to awrite non-chunk function1712. Each of thestorage devices218 receiving thedevice request1202 can write theapplication data214 or the formatteddata216 on thestorage device218. Since each of thestorage devices218 contain theapplication data214 in a complete or non-chunked form, any of theapplication data214 or the formatteddata216 can undergo in-storage processing by thestorage device218 with theapplication data214.
Returning to the branch of the flow chart from theboundary query1706, theboundary query1706 determines if thelogical boundary1206 is provided in theapplication request220, as an example. If theboundary query1706 determines that thelogical boundary1206 is provided, the flow chart can continue to apadding query1714. If theboundary query1706 determines that thelogical boundary1206 is not provided, the flow chart can continue to anormal RAID query1716.
The branch of the flow chart starting with the normal RAID query 1716 represents the handling of the application data 214 with chunking of the application units 304 (or some of the application units 304). An example of this can be the split function described in FIG. 3. As an example, this branch of the flow chart can be used for unstructured application data 214 or for application data 214 with no logical boundary 1206. The chunk size 1704 can be a fixed size or a variable-length size.
Continuing with this branch of the flow chart, the normal RAID query 1716 determines if the application request 220 is for a normal RAID function as the in-storage processing, or not. If so, the flow chart can continue to a chunk function 1718. If not, the flow chart can continue to another portion of the flow chart or can return an error status back to the application 202.
In this example, thechunk function1718 can split theapplication data214 or theapplication units304 or some portion of them in thechunk size1704 for thestorage devices218 to receive theapplication data214. As an example, thedata preprocessor208 can perform thechunk function1718 to generate the formatteddata216 or the formattedunits306 with theapplication data214 translated to thechunk size1704. Thedata preprocessor208 can interact with therequest distributor210 to issue thedevice requests1202 to thestorage devices218.
For illustrative purposes, the chunk function 1718 is described as being performed by the data preprocessor 208, although it is understood that the chunk function 1718 can be executed differently. For example, the storage devices 218 receiving the device requests 1202 can perform the chunk function 1718 as part of the in-storage processing at the storage devices 218.
In this example, the flow chart can continue to awrite chunk function1719. Thewrite chunk function1719 is an example of the in-storage processing at thestorage devices218. Thewrite chunk function1719 writes the formatteddata216 or the formattedunits306 at thestorage devices218 receiving thedevice requests1202 from therequest distributor210.
Returning to the branch of the flow chart from thepadding query1714, the branch below thepadding query1714 represents the handling of theapplication data214 or theapplication units304 or a portion thereof with thedata pads402. An example of this can be the split+padding function as described inFIG. 4.
Thepadding query1714 determines if theapplication data214 or theapplication units304 or some portion of them should be padded to generate the formatteddata216 or the formattedunits306. Thedata preprocessor208 can perform thepadding query1714.
When the padding query 1714 determines that padding of the application units 304 is needed, the flow chart can continue to an application data sizing 1720. The application data sizing 1720 calculates a data size 1722 of the application data 214 for the split+padding function. The data size 1722 is the amount of the application data 214 to be partitioned for the formatted data 216. As an example, the application data sizing 1720 can determine the data size 1722 for the amount of the application unit 304 or multiple application units 304 for each of the formatted units 306. In this example, each of the formatted units 306 is of the chunk size 1704 and the data size 1722 is per chunk.
As a specific example, the data size 1722 can be calculated with Equation 1 below.
data size 1722 = floor(chunk size 1704 / logical boundary 1206) × logical boundary 1206   (Equation 1)
In other words, the data size is calculated with the floor function of thechunk size1704 divided by thelogical boundary1206. The result of the floor function is then multiplied by thelogical boundary1206 to generate thedata size1722.
The flow chart can continue to apad sizing1724. The pad sizing1724 calculates apad size1726 for thedata pads402 for each of the formattedunits306. As an example, thepad size1726 can be calculated withEquation 2 below.
pad size 1726 = chunk size 1704 − data size 1722   (Equation 2)
In other words, the pad size 1726 per chunk, or per each of the formatted units 306, can be calculated by subtracting the data size 1722 per chunk, or per each of the formatted units 306, from the chunk size 1704.
The flow chart can continue to achunk number calculation1728. Thechunk number calculation1728 determines achunk number1730 or the number of the formattedunits306 needed for theapplication data214. Thechunk number1730 can be used to determine the size or length of the formatteddata216. Thedata preprocessor208 can perform thechunk number calculation1728.
The flow chart can continue to asplit function1732. Thesplit function1732 partitions theapplication data214 to thedata size1722 for each of the formattedunits306. Thesplit function1732 is part of generating the formatteddata216 where theapplication units304 are aligned with thechunk size1704 or the formattedunits306. Thedata preprocessor208 can perform thesplit function1732.
The flow chart can continue to awrite pad function1734. Thewrite pad function1734 performs the in-storage processing of writing the formatteddata216 with theapplication data214 partitioned to thedata size1722 and with thedata pads402. Thedata pads402 can include additional information, such as parity, metadata, synchronization fields, or identification fields. Therequest distributor210 can send thedevice requests1202 to thestorage devices218 to perform thewrite pad function1734 of the formatteddata216.
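For illustration only, the following sketch combines the application data sizing 1720, the pad sizing 1724, the chunk number calculation 1728, and the split function 1732 into one routine using Equations 1 and 2; the zero-byte padding and all function and variable names are hypothetical simplifications, since real data pads 402 may carry parity, metadata, or other fields.

```python
# Hedged sketch of the split+padding branch: per-chunk data size (Equation 1),
# per-chunk pad size (Equation 2), the number of chunks needed, and the split
# of the application data into padded formatted units.
import math

def split_with_padding(application_data, chunk_size, logical_boundary):
    data_size = (chunk_size // logical_boundary) * logical_boundary   # Equation 1
    pad_size = chunk_size - data_size                                  # Equation 2
    chunk_number = math.ceil(len(application_data) / data_size)
    formatted_units = []
    for i in range(chunk_number):
        piece = application_data[i * data_size:(i + 1) * data_size]
        formatted_units.append(piece + b"\x00" * (chunk_size - len(piece)))
    return formatted_units, data_size, pad_size

units, data_size, pad_size = split_with_padding(b"a" * 100,
                                                chunk_size=64,
                                                logical_boundary=24)
print(data_size, pad_size, len(units))   # 48 16 3
```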
Returning to thepadding query1714, when thepadding query1714 determines that padding of theapplication units304 is not needed, the flow chart can continue to aredundancy query1736. When theredundancy query1736 determines that redundancy of theapplication data214 is needed, then this branch of the flow chart represents the redundancy function. As an example, the redundancy function is described inFIG. 6.
The flow chart can continue from theredundancy query1736 to the application data sizing1720. As an example,FIG. 17 depicts the application data sizing1720 under theredundancy query1736 to be a separate function from the application data sizing1720 under thepadding query1714, although it is understood that the two functions can perform the same operations and can also be the same function. The application data sizing1720 under theredundancy query1736 can be computed using the expression found inEquation 1 described earlier.
The flow chart can continue to achunk function1718. Thechunk function1718 splits or partitions theapplication data214 to the formatteddata216 as described inFIG. 6. Thedata preprocessor208 can perform thechunk function1718. As an example,FIG. 17 depicts thechunk function1718 under thenormal RAID query1716 to be a separate function from thechunk function1718 under theredundancy query1736, although it is understood that the two functions can perform the same operations and can also be the same function.
The flow chart can continue to a redundancy function 1738. For each chunk, or for each of the formatted units 306, the redundancy function 1738 copies the application data 214 that falls in the range between the data size 1722 and the chunk size 1704 to additional chunks to generate the replica data 602 of FIG. 6.
The flow chart can continue to a write redundancy function 1740. The write redundancy function 1740 writes the formatted data 216, including the application data 214 and the replica data 602. The request distributor 210 can issue the device requests 1202 to the storage devices 218 to perform the write redundancy function 1740. Returning to the branch with the redundancy query 1736, when the redundancy query 1736 determines that redundancy is not needed, the flow chart can continue to the normal RAID query 1716.
For illustrative purposes, the flow chart is described with the split+padding function separately from the redundancy function, although it is understood that the flow chart can provide a different operation. For example, the flow chart can be arranged to provide the split+redundancy function as described in FIG. 5. As an example, this can be accomplished with the redundancy query 1736 being placed before the write pad function 1734. Furthering this example, the redundancy function 1738 above could be modified to operate only on the non-aligned application units 304 to form the aligned data 504 of FIG. 5 as opposed to the replica data 602. The modified redundancy function can be followed by a further write function. The further write function would combine portions of the write pad function 1734 and the write redundancy function 1740. The write pad function 1734 can write a portion of the formatted data 216 with the data pads 402, and the write redundancy function 1740 can write the aligned data 504 as opposed to the replica data 602.
Referring now toFIG. 18, therein is shown an example of a flow chart for a mirroring function for centralized and decentralized embodiments. As examples, the centralized embodiment can be thecomputing system900 ofFIG. 9 or thecomputing system1000 ofFIG. 10. As an example, the decentralized embodiment can be thecomputing system1100 ofFIG. 11.
The flow chart on the left-hand side ofFIG. 18 represents an example of a flow chart for a centralized embodiment. The flow chart on the right-hand side ofFIG. 18 represents an example of a flow chart for a decentralized embodiment.
Starting with the centralized embodiment, therequest distributor210 ofFIG. 2 can receive theapplication request220 ofFIG. 2. Theapplication request220 can include thedata address1204 ofFIG. 12, theapplication data214 ofFIG. 2, and thetransfer length302 ofFIG. 3.
For example, the data preprocessor 208 of FIG. 2 can execute a replica query 1802. The replica query 1802 determines whether the replica data 602 of FIG. 6 should be created or not. As an example, the replica query 1802 can make this determination by checking whether a number 1804 of the replica data 602 being requested is greater than zero. If so, the flow chart can continue to a create replica 1806. If not, the flow chart can continue to the device selection 1708.
As an example, thedevice selection1708 can be the same function or perform the same or similar function as described inFIG. 17. The flow chart can continue to theaddress calculation1710. As with thedevice selection1708, theaddress calculation1710 can be the same function or perform the same or similar function as described inFIG. 17. The flow chart can continue to thewrite non-chunk function1712. As with theaddress calculation1710, thewrite non-chunk function1712 can be the same function or perform the same or similar function as described inFIG. 17.
As an example, the request distributor 210 can execute the device selection 1708, the address calculation 1710, or a combination thereof, and include the outputs of these operations as part of the device request 1202 of FIG. 12. The write non-chunk function 1712 can be performed by one of the storage devices 218 to store the application data 214.
Returning to thereplica query1802, when thereplica query1802 determines thereplica data602 ofFIG. 6 should be generated, then the flow chart can continue to the createreplica1806. As an example, thereplica query1802 can make this determination when thenumber1804 of replica sought is greater than zero.
In this example, the createreplica1806 can generate thereplica data602 from theapplication data214. Thereplica data602 can be as described inFIG. 6. As an example, thedata preprocessor208 can perform the createreplica1806. The createreplica1806 can generate thenumber1804 of thereplica data602 as needed and not just one.
The flow chart can continue to aprepare replica1808. As an example, therequest distributor210 can prepare each of thereplica data602 for thedevice selection1708. Thereplica data602 can be written to thestorage devices218 following the flow chart from thedevice selection1708, as already described.
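For illustration only, the following sketch approximates the centralized mirroring path above, creating the requested number of copies and preparing one non-chunked write per selected device; the "original plus replicas" arrangement, the request fields, and all names are hypothetical.

```python
# Hedged sketch of the centralized mirroring path: when the requested replica
# number is greater than zero, create that many copies of the application data
# and prepare one hypothetical non-chunked write per selected device.
def mirror_write(application_data, replica_number, storage_devices):
    copies = [application_data] * (replica_number + 1)    # original plus replicas
    writes = []
    for copy, device in zip(copies, storage_devices):
        writes.append({"device": device, "length": len(copy)})   # write non-chunk
    return writes

print(mirror_write(b"record", replica_number=2,
                   storage_devices=["DEV1", "DEV2", "DEV3"]))
```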
Returning to the flow chart for the decentralized embodiment on the right-hand side ofFIG. 18, therequest distributor210 can receive theapplication request220. Theapplication request220 can include thedata address1204, theapplication data214, thetransfer length302, and thenumber1804 of thereplica data602.
Therequest distributor210 can send one of thedevice requests1202 to one of thestorage devices218. Thatstorage device218 can perform theaddress calculation1710. As an example, theaddress calculation1710 can be the same function or perform the same or similar function as described inFIG. 17 and as for the centralized embodiment.
In this example, thesame storage device218 can also perform thewrite non-chunk function1712. As an example, thewrite non-chunk function1712 can be the same function or perform the same or similar function as described inFIG. 17 and as for the centralized embodiment.
The flow chart can continue to thereplica query1802. As an example, the replica query can be the same function or perform the same or similar function as described for the centralized embodiment. If thenumber1804 for thereplica data602 is not greater than zero, the process to write additional data stops for thisparticular application request220.
If the replica query 1802 determines that the number 1804 for the replica data 602 is greater than zero, then the flow chart can continue to a group selection 1810. The group selection 1810 can select one of the storage devices 218 in the same replica group 1812. The replica group 1812 is a portion of the storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2 designated to be part of a redundancy function for the application data 214 and for in-storage processing. The request distributor 210 can perform the replica query 1802, the group selection 1810, or a combination thereof.
The flow chart can continue to anumber update1814. Thenumber update1814 can decrement thenumber1804 forreplica data602 still to be written to thereplica group1812. The decrement amount can be by an integer value, such as one. Therequest distributor210 can perform thenumber update1814.
The flow chart can continue to arequest generation1816. Therequest generation1816 generates one of thedevice requests1202 to another of thestorage devices218 in thereplica group1812 for writing thereplica data602. Therequest distributor210 can perform therequest generation1816.
The flow chart can loop back (not drawn inFIG. 18) to thereplica query1802 and iterate until thenumber1804 has reached zero. At this point, thereplica data602 has been written to thereplica group1812.
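For illustration only, the following sketch approximates the decentralized loop above, in which each device writes its copy, decrements the remaining replica count, and forwards a request to the next device in the replica group until the count reaches zero; the list-based transport and all names are hypothetical.

```python
# Hedged sketch of the decentralized mirroring loop: write, decrement the
# replica count (number update), select the next device in the replica group,
# and generate the next request until the count reaches zero.
def replica_chain(application_data, replica_number, replica_group):
    writes = []
    device_index = 0
    remaining = replica_number
    writes.append((replica_group[device_index], len(application_data)))  # first write
    while remaining > 0:
        device_index = (device_index + 1) % len(replica_group)           # group selection
        remaining -= 1                                                    # number update
        writes.append((replica_group[device_index], len(application_data)))  # request generation
    return writes

print(replica_chain(b"record", replica_number=2,
                    replica_group=["DEV_A", "DEV_B", "DEV_C"]))
# [('DEV_A', 6), ('DEV_B', 6), ('DEV_C', 6)]
```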
For illustrative purposes, the decentralized embodiment is described as operating in a serial manner writing to one of thestorage devices218 at a time, although it is understood that the decentralized embodiment can operate differently. For example, therequest distributor210 can issue a number ofdevice requests1202 to thestorage devices218 in thereplica group1812 and have thereplica data602 written onmultiple storage devices218 simultaneously before theother storage devices218 in the replica group completes the write.
It has been discovered that the computing system provides efficient distributed processing by providing methods and apparatuses for performing in-storage processing of application data with multiple storage devices. An execution of an application can be shared by distributing the execution among various devices in a storage device. Each of the devices can perform in-storage processing with the application data as requested by an application request.
It has also been discovered that the computing system can reduce overall system power consumption by reducing the number of inputs/outputs between the application execution and the storage device. This reduction is achieved by having the devices perform the in-storage processing instead of the application merely storing, reading, and re-storing the data. The in-storage processing outputs can be returned as an aggregated output from the various devices that performed the in-storage processing back to the application. The application can continue to execute and utilize the in-storage processing outputs, the aggregated output, or a combination thereof.
It has been discovered that the computing system provides for reduced total cost of ownership by providing formatting and translation functions for the application data across different configurations or organizations of the storage device. Further, the computing system also provides translation for the type of in-storage processing to be carried out by the devices in the storage device. Examples of types of translation or formatting include split, split+padding, split+redundancy, and mirroring.
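For illustrative purposes, a minimal Python sketch of such a formatting or translation function is shown below; the chunk size, the zero-byte data pad, and the function name format_application_data are assumptions for illustration only.

def format_application_data(application_data, mode, chunk_size=4, copies=2):
    if mode == "split":
        # split: divide the application data 214 into formatted units
        return [application_data[i:i + chunk_size]
                for i in range(0, len(application_data), chunk_size)]
    if mode == "split+padding":
        # split+padding: pad the last formatted unit with a data pad
        units = format_application_data(application_data, "split", chunk_size)
        if units:
            units[-1] = units[-1].ljust(chunk_size, b"\x00")
        return units
    if mode == "mirroring":
        # mirroring: replicate the whole application data for each device
        return [bytes(application_data) for _ in range(copies)]
    raise ValueError("unsupported formatting type: " + mode)

The split+redundancy case, which would additionally generate redundancy data for the split units, is omitted from this sketch for brevity.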
It has been discovered that the computing system provides more efficient execution of the application, with fewer interrupts to the application, via the output coordination of the in-storage processing outputs from the storage devices. The output coordination can buffer the in-storage processing outputs and can also sort the order of each of the in-storage processing outputs before returning an aggregated output to the application. The application can continue to execute and utilize the in-storage processing outputs, the aggregated output, or a combination thereof.
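For illustrative purposes, a minimal Python sketch of such output coordination is shown below; buffering the outputs in a dictionary keyed by a sub-request sequence number and joining the sorted values is one assumed realization, not the only one.

class OutputCoordinator:
    def __init__(self, expected):
        self.expected = expected    # number of in-storage processing outputs awaited
        self.buffer = {}            # buffered outputs keyed by sub-request order

    def collect(self, sequence_number, output):
        self.buffer[sequence_number] = output

    def aggregated_output(self):
        if len(self.buffer) < self.expected:
            raise RuntimeError("still awaiting in-storage processing outputs")
        # sort by sub-request order before returning the aggregated output
        return b"".join(self.buffer[key] for key in sorted(self.buffer))

coordinator = OutputCoordinator(expected=3)
coordinator.collect(2, b"C"); coordinator.collect(0, b"A"); coordinator.collect(1, b"B")
assert coordinator.aggregated_output() == b"ABC"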
It has been discovered that the computing system further minimizes integration obstacles by allowing the devices in the storage group to have different or the same functionalities. As an example, one of the devices can function as the only output coordinator for all the in-storage processing outputs from the other devices. As a further example, the aggregation function can be distributed amongst the devices, passing along and performing partial aggregation from device to device until one of the devices returns the full aggregated output back to the application. The application can continue to execute and utilize the in-storage processing outputs, the aggregated output, or a combination thereof.
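For illustrative purposes, the distributed aggregation variant can be sketched as a fold over the devices' outputs; the summation below merely stands in for whatever reduction the application actually requests.

def chained_aggregation(device_outputs, partial=0):
    # each device folds its in-storage processing output into the partial
    # aggregated output and passes the partial result along to the next device
    for output in device_outputs:
        partial = partial + output
    # the last device returns the full aggregated output to the application
    return partial

assert chained_aggregation([1, 2, 3]) == 6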
The modules described in this application can be hardware implementations or hardware accelerators in the computing system 100. The modules can also be hardware implementations or hardware accelerators within the computing system 100 or external to the computing system 100.
The modules described in this application can be implemented as instructions stored on a non-transitory computer readable medium to be executed by the computing system 100. The non-transitory computer readable medium can include memory internal to or external to the computing system 100. The non-transitory computer readable medium can include non-volatile memory, such as a hard disk drive, non-volatile random access memory (NVRAM), solid-state drive (SSD), compact disk (CD), digital video disk (DVD), or universal serial bus (USB) flash memory devices. The non-transitory computer readable medium can be integrated as a part of the computing system 100 or installed as a removable portion of the computing system 100.
Referring now to FIG. 19, therein is shown a flow chart of a method 1900 of operation of a computing system 100 in an embodiment of the present invention. The method 1900 includes: performing in-storage processing with a storage device with formatted data based on application data from an application in a block 1902; and returning an in-storage processing output from the storage device to the application for continued execution in a block 1904.
The method 1900 can further include receiving a sub-application request at the storage device based on an application request from the application for performing in-storage processing. The method 1900 can further include sorting in-storage processing outputs from a storage group including the storage device. The method 1900 can further include issuing a device request based on an application request from the application to a storage group including the storage device.
The method 1900 can further include issuing a device request from the storage device; receiving the device request at another storage device; generating another device request by the another storage device; and receiving the another device request by yet another storage device.
The method 1900 can further include sending in-storage processing outputs by a storage group including the storage device to be aggregated and sent to the application. The method 1900 can further include aggregating an in-storage processing output as a partial aggregated output to be returned to the application. The method 1900 can further include generating the formatted data based on the application data. The method 1900 can further include generating a formatted unit of the formatted data with an application unit of the application data and a data pad. The method 1900 can further include generating a formatted unit of the formatted data with non-aligned instances of application units of the application data and a data pad.
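For illustrative purposes, a minimal end-to-end Python sketch of the blocks 1902 and 1904 is shown below; the word-count workload and the class name ComputeEnabledStorageDevice are assumptions for illustration only.

class ComputeEnabledStorageDevice:
    def __init__(self):
        self.formatted_data = b""

    def store(self, application_data):
        # formatted data generated based on the application data (a straight copy here)
        self.formatted_data = bytes(application_data)

    def in_storage_processing(self):
        # block 1902: perform in-storage processing with the formatted data
        return len(self.formatted_data.split())

def application():
    device = ComputeEnabledStorageDevice()
    device.store(b"in storage processing example")
    result = device.in_storage_processing()   # block 1902
    # block 1904: the in-storage processing output is returned to the application
    print("in-storage processing output:", result)

application()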
While the invention has been described in conjunction with a specific best mode, it is to be understood that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, it is intended to embrace all such alternatives, modifications, and variations that fall within the scope of the included claims. All matters set forth herein or shown in the accompanying drawings are to be interpreted in an illustrative and non-limiting sense.