BACKGROUND OF THE DISCLOSURE
NAND flash memory, as well as other types of non-volatile memories (“NVMs”), are commonly used for mass storage. For example, consumer electronics such as portable media players often include flash memory to store music, videos, and other media.
Because a file system of a device may perform multiple small synchronous input/outputs (“I/Os”), die parallelism cannot be achieved in some cases. This consequently may reduce system bandwidth, and increase the latency for programming the pages of a NVM.
In some scenarios, the latency for programming a NVM can be improved by buffering one or more I/Os prior to programming the I/Os to the NVM. However, in other scenarios, if I/Os can instead be directly programmed to the NVM, indiscriminate buffering of all I/Os received from a file system may actually slow down the programming process.
SUMMARY OF THE DISCLOSURE
Systems and methods are disclosed for efficient buffering of a system having non-volatile memory (“NVM”). The system can select to buffer writes where the buffering will increase system performance, and select to program writes directly to a NVM where the buffering would be a net performance burden. In some embodiments, a system can use heuristics to determine whether to perform buffering of one or more write commands received from a file system. In other embodiments, the system can minimize read energy and buffering overhead by efficiently re-ordering write commands in a queue along page-aligned boundaries of a buffer. In further embodiments, the system can optimally combine write commands from a buffer with write commands from a queue. After combining the commands, the system can dispatch the commands in a single transaction.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other aspects and advantages of the invention will become more apparent upon consideration of the following detailed description, taken in conjunction with accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
FIGS. 1 and 2 are block diagrams of electronic devices configured in accordance with various embodiments of the invention;
FIG. 3 is a functional view of a non-volatile memory in accordance with various embodiments of the invention;
FIG. 4 is a graphical view of illustrative timing diagrams for various systems;
FIG. 5 is a graphical view of an illustrative programming flow for a system configured in accordance with various embodiments of the invention;
FIG. 6 is a graphical view of an illustrative buffer configured in accordance with various embodiments of the invention;
FIGS. 7A and 7B are flowcharts of an illustrative process for efficient buffering in a non-volatile memory in accordance with various embodiments of the invention; and
FIG. 8 is a flowchart of an illustrative process for determining whether to buffer in a non-volatile memory in accordance with various embodiments of the invention.
DETAILED DESCRIPTION OF THE DISCLOSURE
Systems and methods for efficient buffering of a system having non-volatile memory (“NVM”) are provided. A control circuitry of a system can use heuristics to determine whether or not to buffer write commands received from a file system. Any suitable heuristics can be used, such as, for example, the size of a write command, the number of dispatched write commands immediately preceding the write command, the number of remaining write commands saved in a queue, the amount of available space in memory, the size of a buffer, and input/output (“I/O”) patterns.
In some embodiments, the control circuitry can determine whether or not to buffer a write command based on the size of the command. For example, if the control circuitry determines that the write command is a relatively small command (e.g., the size of the write command is below a pre-determined threshold), the control circuitry can select a buffered path for the write command (e.g., transfer the write command to a buffer in memory). If the control circuitry instead determines that the write command is a relatively large command (e.g., the size of the write command exceeds a pre-determined threshold), the control circuitry can select a write path for the write command (e.g., direct a bus controller to dispatch the write command to a NVM).
In some embodiments, instead of automatically selecting a buffered path for small write commands, the control circuitry can use additional heuristics to determine if write commands saved in a queue can be combined into a write-multi command directly in the queue. Once the commands have been combined into the write-multi command, the control circuitry can dispatch the write-multi command directly from the queue to a NVM via a write path.
In other embodiments, in order to maximize concurrency, the control circuitry can attempt to re-order the write commands in a queue along page-aligned boundaries of a buffer. Consequently, read energy can be minimized on a subsequent read because user data is often read with the same alignments and granularity as the alignments and granularity with which the user data was programmed. In further embodiments, a control circuitry can combine buffered write commands and queue write commands into the same batch or transaction to improve system efficiency.
FIG. 1 illustrates a block diagram of electronic device 100. In some embodiments, electronic device 100 can be or can include a portable media player, a cellular telephone, a pocket-sized personal computer, a personal digital assistant (“PDA”), a desktop computer, a laptop computer, or any other suitable type of electronic device.
Electronic device 100 can include system-on-a-chip (“SoC”) 110 and non-volatile memory (“NVM”) 120. Non-volatile memory 120 can include a NAND flash memory based on floating gate or charge trapping technology, NOR flash memory, erasable programmable read only memory (“EPROM”), electrically erasable programmable read only memory (“EEPROM”), Ferroelectric RAM (“FRAM”), magnetoresistive RAM (“MRAM”), or any combination thereof.
NVM 120 can be organized into “blocks”, which can be the smallest erasable unit, and further organized into “pages”, which can be the smallest unit that can be programmed or read. In some embodiments, NVM 120 can include multiple integrated circuits, where each integrated circuit may have multiple blocks. Memory locations (e.g., blocks or pages of blocks) from corresponding integrated circuits may form “super blocks”. Each memory location (e.g., page or block) of NVM 120 can be referenced using a physical address (e.g., a physical page address or physical block address).
System-on-a-chip 110 can include SoC control circuitry 112, memory 114, encryption module 116, and NVM interface 118. SoC control circuitry 112 can control the general operations and functions of SoC 110 and the other components of SoC 110 or device 100. For example, responsive to user inputs and/or the instructions of an application or operating system, SoC control circuitry 112 can issue read or write commands to NVM interface 118 to obtain data from or store data in NVM 120. For clarity, data that SoC control circuitry 112 may request for storage or retrieval may be referred to as “user data,” even though the data may not be directly associated with a user or user application. Rather, the user data can be any suitable sequence of digital information generated or obtained by SoC control circuitry 112 (e.g., via an application or operating system).
SoC control circuitry 112 can include any combination of hardware, software, and firmware, and any components, circuitry, or logic operative to drive the functionality of electronic device 100. For example, SoC control circuitry 112 can include one or more processors that operate under the control of software/firmware stored in NVM 120 or memory 114.
SoC control circuitry 112 can dispatch one or more commands to NVM 120. In some embodiments, SoC control circuitry 112 can include a block device driver or wrapper that can be configured to dispatch application programming interface (“API”) operations to NVM 120 or a controller of NVM 120. In some embodiments, SoC control circuitry 112 can modify one or more parameters of the block device driver or wrapper in order to transfer information to NVM 120. For example, by modifying the one or more parameters, SoC control circuitry 112 can transfer information associated with commands used to access NVM 120 (e.g., read, program, erase, and/or write-multi commands).
Memory 114 can include any suitable type of volatile memory, such as random access memory (“RAM”) (e.g., static RAM (“SRAM”), dynamic random access memory (“DRAM”), synchronous dynamic random access memory (“SDRAM”), double-data-rate (“DDR”) RAM), cache memory, read-only memory (“ROM”), or any combination thereof. Memory 114 can include a data source that can temporarily store user data for programming into or reading from non-volatile memory 120. In some embodiments, memory 114 may act as the main memory for any processors implemented as part of SoC control circuitry 112.
In some embodiments, memory 114 can include one or more command queues (not shown in FIG. 1) for saving commands (e.g., read, write, and/or trim commands) received from a file system that have not yet been dispatched to NVM 120. The one or more command queues can be block-device level queues. For example, memory 114 can include a read queue for retaining read commands, a write queue for retaining write commands, and a trim queue for retaining trim commands. Consequently, SoC control circuitry 112 can scan one or more queues in order to select one or more commands that may be dispatched to NVM 120. After the one or more commands have been dispatched, SoC control circuitry 112 can remove those commands from the queue. Persons skilled in the art will appreciate that memory 114 may instead include one queue configured to retain all types of commands (e.g., read, write, and trim commands).
The one or more queues will generally be serviced until the queues are empty. In some embodiments, SoC control circuitry 112 can have a set of fairness rules for servicing each of the separate queues. For example, SoC control circuitry 112 may attempt to service the queues such that each queue is serviced at a rate approximately proportional to the number of commands currently included in that queue. The order in which the queues are serviced affects the quality of service of the system.
In some embodiments, because commands may not be able to be saved in the queues indefinitely (e.g., other threads or processes may be waiting for those I/Os to complete), memory 114 can include a buffer (not shown in FIG. 1) that can provide for longer-term storage of commands. For example, in some cases, SoC control circuitry 112 can select one or more write commands saved in a queue, and transfer the selected write commands to the buffer. After saving the write commands to the buffer, SoC control circuitry 112 can remove those write commands from the queue.
Persons skilled in the art will appreciate that the buffer in memory 114 can have any suitable size. For example, the buffer may be a relatively small buffer capable of storing less than one page of data. However, if the minimum program granularity is one page or larger, the buffer may be a relatively large buffer capable of storing one or more pages of data. For instance, the buffer may have a size of one page, one stripe, multiple stripes, a super block, or any other suitable size.
Encryption module 116 can be or may include any hardware or software, or combination thereof, configured to perform encryption and decryption based on a suitable cipher. For example, encryption module 116 can be based on the Advanced Encryption Standard (“AES”), Data Encryption Standard (“DES”), or RSA. Encryption module 116 can provide security for sensitive data, such as personal information or billing information, stored on NVM 120 or transmitted/received to/from SoC 110. Although encryption module 116 is shown in FIG. 1 as included in SoC 110, persons skilled in the art will appreciate that encryption module 116 may instead be implemented in NVM 120.
In some embodiments, encryption module 116 can encrypt and decrypt data using one or more “encryption seeds” provided by control circuitry (e.g., SoC control circuitry 112 or a controller on NVM 120), which may be required by the encryption algorithm to perform encryption or decryption. In some embodiments, and particularly for AES-based encryption modules, the encryption seeds can include a key and an initialization vector (“IV”). To recover the original unencrypted data from encrypted data, the encryption seeds used for decryption may need to be the same as the seeds originally used for encryption.
NVM interface 118 may include any suitable combination of hardware, software, and/or firmware configured to act as an interface or driver between SoC control circuitry 112 and NVM 120. For any software modules included in NVM interface 118, corresponding program code may be stored in NVM 120 or memory 114.
NVM interface 118 can perform a variety of functions that allow SoC control circuitry 112 to access NVM 120 and to manage the memory locations (e.g., pages, blocks, super blocks, integrated circuits) of NVM 120 and the data stored therein (e.g., user data). For example, NVM interface 118 can interpret the read or write requests from SoC control circuitry 112, perform wear leveling, and generate read and program instructions compatible with the bus protocol of NVM 120.
While NVM interface 118 and SoC control circuitry 112 are shown as separate modules, this is intended only to simplify the description of the embodiments of the invention. It should be understood that these modules may share hardware components, software components, or both. For example, SoC control circuitry 112 may execute a software-based memory driver for NVM interface 118.
In some embodiments, electronic device 100 can include a target device, such as a flash memory drive or Secure Digital (“SD”) card, that includes NVM 120 and some or all portions of NVM interface 118 (e.g., a translation layer, discussed below). In these embodiments, SoC 110 or SoC control circuitry 112 may act as the host controller for the target device. For example, as the host controller, SoC 110 can issue read and write requests to the target device.
FIG. 2 illustrates a block diagram of electronic device 200, which may illustrate in greater detail some of the firmware, software, and/or hardware components of electronic device 100 (FIG. 1) in accordance with various embodiments. Electronic device 200 may have any of the features and functionalities described above in connection with FIG. 1, and vice versa. As shown, dashed lines demarcate the layers. It is understood that the depiction of which components fall within the demarcation lines is merely illustrative and that one or more components can be affiliated with a different layer.
Electronic device 200 can include file system 210, NVM driver 212, NVM bus controller 216, and NVM 220. In some embodiments, file system 210 and NVM driver 212 may be software or firmware modules, and NVM bus controller 216 and NVM 220 may be hardware modules. Accordingly, in these embodiments, NVM driver 212 may represent the software or firmware aspect of NVM interface 218, and NVM bus controller 216 may represent the hardware aspect of NVM interface 218.
File system 210 can include any suitable type of file system, such as a File Allocation Table (“FAT”) file system or a Hierarchical File System Plus (“HFS+”), and may be part of the operating system of electronic device 200 (e.g., part of SoC control circuitry 112 of FIG. 1). In some embodiments, file system 210 may include a flash file system, which provides a logical to physical mapping of pages. In these embodiments, file system 210 may perform some or all of the functionalities of NVM driver 212 discussed below, and therefore file system 210 and NVM driver 212 may or may not be separate modules.
File system 210 may manage file and folder structures for the application and operating system. File system 210 may operate under the control of an application or operating system running on electronic device 200, and may provide write and read commands to NVM driver 212 when the application or operating system requests that information be read from or stored in NVM 220. Along with each read or write command, file system 210 can provide a logical address to indicate where the user data should be read from or written to, such as a logical page address or a logical block address (“LBA”) with a page offset.
File system 210 may provide read and write requests to NVM driver 212 that are not directly compatible with NVM 220. For example, the logical addresses may use conventions or protocols typical of hard-drive-based systems. A hard-drive-based system, unlike flash memory, can overwrite a memory location without first performing a block erase. Moreover, hard drives may not need wear leveling to increase the lifespan of the device. Therefore, NVM interface 218 can perform any functions that are memory-specific, vendor-specific, or both to handle file system requests and perform other management functions in a manner suitable for NVM 220.
NVM driver 212 can include translation layer 214. In some embodiments, translation layer 214 may be or include a flash translation layer (“FTL”). On a write command, translation layer 214 can map the provided logical address to a free, erased physical location on NVM 220. On a read command, translation layer 214 can use the provided logical address to determine the physical address at which the requested data is stored. Because each NVM may have a different layout depending on the size or vendor of the NVM, this mapping operation may be memory and/or vendor-specific. Translation layer 214 can perform any other suitable functions in addition to logical-to-physical address mapping. For example, translation layer 214 can perform any of the other functions that may be typical of flash translation layers, such as garbage collection and wear leveling.
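As one hypothetical illustration of the logical-to-physical mapping described above, the following minimal sketch maps each write to the next free, erased page and resolves reads through the resulting table. The class name, the free-page list, and the allocation order are assumptions made only for illustration; an actual translation layer would also handle garbage collection, wear leveling, and vendor-specific layouts.

```python
class TranslationLayer:
    """Toy logical-to-physical page map (hypothetical, for illustration only)."""

    def __init__(self, num_physical_pages):
        # Assumption: every physical page starts out free and erased.
        self.free_pages = list(range(num_physical_pages))
        self.l2p = {}  # logical page address -> physical page address

    def write(self, logical_addr):
        # A write is mapped to a free, erased page; any older copy of the
        # same logical address simply becomes stale.
        physical_addr = self.free_pages.pop(0)
        self.l2p[logical_addr] = physical_addr
        return physical_addr

    def read(self, logical_addr):
        # A read uses the logical address to find where the data is stored.
        return self.l2p[logical_addr]


ftl = TranslationLayer(num_physical_pages=8)
first = ftl.write(logical_addr=5)
second = ftl.write(logical_addr=5)  # a rewrite lands on a new physical page
```

In this sketch, rewriting the same logical address does not overwrite the old physical page, which reflects the fact that flash memory cannot be overwritten without first performing a block erase.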
NVM driver 212 may interface with NVM bus controller 216 to complete NVM access commands (e.g., program, read, and erase commands). Bus controller 216 may act as the hardware interface to NVM 220, and can communicate with NVM 220 using the bus protocol, data rate, and other specifications of NVM 220.
In some embodiments, upon receiving a command (e.g., a write command) from file system 210, NVM interface 218 can direct NVM bus controller 216 to dispatch a command (e.g., an API operation) to NVM 220. In some embodiments, NVM interface 218 may use translation layer 214 to obtain the physical addresses of one or more memory locations of NVM 220.
NVM interface 218 may manage NVM 220 based on memory management data, sometimes referred to herein as “metadata”. The metadata may be generated by NVM driver 212 or may be generated by a module operating under the control of NVM driver 212. For example, metadata can include any information used for managing the mapping between logical and physical addresses, bad block management, wear leveling, error-correcting code (“ECC”) data used for detecting or correcting data errors, or any combination thereof. The metadata may include data provided by file system 210 along with the user data, such as a logical address. Thus, in general, “metadata” may refer to any information about or relating to user data or used generally to manage the operation and memory locations of a non-volatile memory.
NVM interface 218 may be configured to store metadata in NVM 220. In some embodiments, NVM interface 218 may store metadata associated with user data at the same memory location (e.g., page) in which the user data is stored. For example, NVM interface 218 may store user data, the associated logical address, and ECC data for the user data at one or more memory locations of NVM 220. NVM interface 218 may also store other types of metadata about the user data in the same memory location.
NVM interface 218 may store the logical address so that, on power-up of NVM 220 or during operation of NVM 220, electronic device 200 can determine what data resides at that location. In particular, because file system 210 may reference the user data according to its logical address and not its physical address, NVM interface 218 may store the user data and logical address together to maintain their association. This way, even if a separate table maintaining the physical-to-logical mapping in NVM 220 becomes outdated, NVM interface 218 may still determine the proper mapping at power-up or reboot of electronic device 200, for example.
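The recovery of a mapping from logical addresses stored alongside user data can be sketched as follows. This is a minimal illustration under stated assumptions: the tuple layout and, in particular, the per-page write sequence number used to choose the newest copy of a logical address are hypothetical and are not described in the disclosure.

```python
def rebuild_mapping(scanned_pages):
    """Rebuild a logical-to-physical map from per-page metadata.

    Each entry is (physical_addr, logical_addr, sequence). The sequence
    number is a hypothetical field used to pick the most recent copy when
    the same logical address appears in more than one physical page.
    """
    newest = {}  # logical_addr -> (sequence, physical_addr)
    for physical_addr, logical_addr, sequence in scanned_pages:
        if logical_addr is None:
            continue  # page holds no user data (e.g., it is erased)
        if logical_addr not in newest or sequence > newest[logical_addr][0]:
            newest[logical_addr] = (sequence, physical_addr)
    return {lba: phys for lba, (_, phys) in newest.items()}


# Logical address 7 was written twice; the later copy (sequence 3) wins.
scanned = [(0, 7, 1), (1, 3, 2), (2, 7, 3), (3, None, 0)]
mapping = rebuild_mapping(scanned)
```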
As discussed above, a NVM (e.g., NVM 120 of FIG. 1 or NVM 220 of FIG. 2) can be organized into dies, blocks, pages, super blocks, and the like. For example, FIG. 3 shows a schematic layout of NVM 300. NVM 300 can be the same as or similar to NVM 120 (FIG. 1) or NVM 220 (FIG. 2). FIG. 3 is merely meant to illustrate the organizational layout of NVM 300 and does not indicate an actual, physical layout of the non-volatile memory. For instance, although die 0 is illustrated as being next to die 1 in FIG. 3, this is merely for illustrating the functional relationship of these dies, and in the actual, physical layout of NVM 300, these dies may or may not be located near one another.
Moreover, although a certain number of dies, blocks, and pages are shown in FIG. 3, this is merely for the purpose of illustration and one skilled in the art could appreciate that NVM 300 could include any suitable number of dies, blocks, and pages. As one illustration, each die of NVM 300 can include 4096 blocks, and each block may include 64, 128, or 192 pages depending on whether the block is an SLC block, a 2-bit MLC block, or a 3-bit MLC block. In addition, each page of NVM 300 can have any suitable size such as, for example, 2 KiB, 4 KiB, 8 KiB, or 16 KiB.
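The illustrative figures above can be combined into a per-die capacity estimate. The sketch below uses the block and page counts given in the text; the choice of 8 KiB pages is an assumption picked from the listed page sizes, and the function name is hypothetical.

```python
BLOCKS_PER_DIE = 4096
PAGES_PER_BLOCK = {"SLC": 64, "2-bit MLC": 128, "3-bit MLC": 192}
PAGE_SIZE_BYTES = 8 * 1024  # assumption: 8 KiB pages


def die_capacity_bytes(block_type):
    """Capacity of one die = blocks per die x pages per block x page size."""
    return BLOCKS_PER_DIE * PAGES_PER_BLOCK[block_type] * PAGE_SIZE_BYTES


for block_type in PAGES_PER_BLOCK:
    gib = die_capacity_bytes(block_type) / 2**30
    print(f"{block_type}: {gib:g} GiB per die")
```

Under these assumptions, a die of SLC blocks holds 2 GiB, a die of 2-bit MLC blocks holds 4 GiB, and a die of 3-bit MLC blocks holds 6 GiB.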
As illustrated by FIG. 3, NVM 300 can include one or more dies (i.e., integrated circuits), such as die 0, die 1, die 2, and die 3. Each die of NVM 300 may be organized into one or more “blocks”. For example, dies 0-3 are each illustrated as being organized into blocks 0-3.
Each block of the dies may be organized into one or more pages. For example, block 0 of each of dies 0-3 is illustrated as being organized into pages 0-3. Each page of NVM 300 can include any suitable information. For example, the pages can include user data, metadata, or both. In some embodiments, metadata such as ECC data can be included in the pages to provide for error detection and/or correction of the information stored in NVM 300.
Over time, page sizes have increased in order to compensate for the longer time that it takes to program a NVM. That is, because memory cells of NVMs have decreased in size, more precise pulses have to be applied to memory cells of a NVM in order to program data in those cells. Accordingly, by increasing the size of pages, the same amount of user data can be programmed during any particular period of time. The net effect of this is that a user of an electronic device will not experience a reduction in the overall programming speed of a NVM.
NVM 300 can also include one or more super blocks that include one block from each die that has the same position or “block number” in the die. For example, super block 0 of NVM 300 can include block 0 of each of dies 0-3. Similarly, super block 1 of NVM 300 can include block 1 of each of dies 0-3, super block 2 of NVM 300 can include block 2 of each of dies 0-3, and so forth.
Super blocks can be formed by virtually linking or “striping” together one or more blocks. Blocks need not be in the same row of each die to be virtually linked as a super block. In fact, blocks may be chosen randomly from two or more dies to form a super block. In some embodiments, a super block can include only one block. Super blocks provide operational parallelism, thereby enabling programming, reading, and erase operations to be performed on blocks located in different dies in parallel.
In addition, pages from two or more dies may be virtually linked together to form super pages or “stripes”. For instance, stripe 302 can include page 0 from block 0 of each of dies 0-3. Similarly, stripe 304 can include page 3 from block 0 of each of dies 0-3. A translation layer (e.g., translation layer 214 of FIG. 2) being implemented in an NVM interface may keep track of super blocks or stripes. Generally, the programming time can remain the same for pages that are included in a particular stripe (e.g., stripe 302 or 304).
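The same-position grouping of super blocks and stripes in FIG. 3 can be expressed as a short sketch. The function names and the tuple representation are hypothetical, and the sketch covers only the simple case where each die contributes the block or page with the same number; as noted above, blocks may also be chosen from different positions.

```python
NUM_DIES = 4  # dies 0-3, as illustrated in FIG. 3


def super_block(block_number, num_dies=NUM_DIES):
    """(die, block) pairs linked into the super block with this number."""
    return [(die, block_number) for die in range(num_dies)]


def stripe(block_number, page_number, num_dies=NUM_DIES):
    """(die, block, page) triples linked into one stripe across the dies."""
    return [(die, block_number, page_number) for die in range(num_dies)]


stripe_302 = stripe(block_number=0, page_number=0)  # page 0 of block 0
stripe_304 = stripe(block_number=0, page_number=3)  # page 3 of block 0
```

Because every element of a stripe sits on a different die, the pages of a stripe can be programmed in parallel.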
A file system of a system (e.g., file system 210 of FIG. 2) can operate in a logical space. Thus, as discussed above, each file in the system can map to one or more LBAs. Correspondingly, the one or more LBAs can be mapped to one or more physical page addresses in the NVM.
However, when a NVM is configured to have larger pages capable of storing multiple LBAs, the file system may sometimes issue a write command that may not completely fill up a physical page in a NVM. In conventional systems, a control circuitry (e.g., control circuitry 112 of FIG. 1) can pad out the unfilled portions of a page with invalid data, and program the write command along with the invalid data to the NVM. This operation wastes space in the NVM, and may consequently increase the amount of garbage collection (“GC”) that needs to be performed on the NVM.
Accordingly, in order to increase system efficiency, a system can be configured to dispatch multiple write commands to a NVM in a single transaction.
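The space lost to padding can be illustrated with a small calculation. The 4 KiB LBA size is an assumption, the 16 KiB page size is one of the page sizes mentioned above, and the function name is hypothetical.

```python
LBA_SIZE_BYTES = 4 * 1024    # assumption: 4 KiB per LBA
PAGE_SIZE_BYTES = 16 * 1024  # one of the illustrative page sizes


def padding_bytes(num_lbas):
    """Invalid padding needed when a write is programmed on its own."""
    used = num_lbas * LBA_SIZE_BYTES
    remainder = used % PAGE_SIZE_BYTES
    return 0 if remainder == 0 else PAGE_SIZE_BYTES - remainder


single = padding_bytes(1)    # one 4 KiB write programmed alone
combined = padding_bytes(4)  # four such writes dispatched together
```

Under these assumptions, a single-LBA write programmed alone leaves 12 KiB (three quarters) of the page filled with invalid data, whereas four such writes dispatched together fill the page exactly.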
FIG. 4 shows illustrative timing diagrams for various systems. For example, as shown in FIG. 4, the programming time (t_prog) is not substantially different if one page (as shown in timing diagram 400) or multiple pages (as shown in timing diagram 402) are being programmed in a single operation (e.g., as a write-multi command). This may be particularly true if the multiple pages are included in a single stripe.
In addition, the time for startup programming (t_startup) also does not vary substantially if one page or multiple pages are being programmed in a single operation. As used herein, the “time for startup programming” refers to the time that it takes to activate the NVM interface (e.g., NVM interface 218 of FIG. 2).
Moreover, under most scenarios, the time that it takes to transfer contents to a page buffer (t_transfer) is relatively short as compared to t_prog. For example, as shown in timing diagram 402, t_transfer for programming a write-multi command can still be of shorter duration as compared to t_prog. However, as shown in timing diagram 404, programming multiple pages separately is less efficient due to the combination of multiple t_progs. The efficiency of the system is especially impacted if the amount of user data that can be programmed each time is relatively small.
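The comparison among the timing diagrams can be approximated with a simplified model. The model below is only a sketch: it assumes that the pages of a write-multi command lie in a single stripe so that one t_prog covers all of them, that t_transfer scales with the number of pages, and that each separate write pays its own t_startup. The microsecond values are hypothetical and are not taken from FIG. 4.

```python
def write_multi_time(num_pages, t_startup, t_transfer, t_prog):
    # One startup, one transfer per page, and a single shared program phase.
    return t_startup + num_pages * t_transfer + t_prog


def separate_writes_time(num_pages, t_startup, t_transfer, t_prog):
    # Every page pays its own startup, transfer, and program time.
    return num_pages * (t_startup + t_transfer + t_prog)


# Hypothetical timings in microseconds.
timings = dict(t_startup=50, t_transfer=100, t_prog=1300)
multi = write_multi_time(4, **timings)
separate = separate_writes_time(4, **timings)
```

With these assumed values, four pages take 1750 microseconds as a write-multi command and 5800 microseconds when programmed separately, with the difference dominated by the repeated t_prog.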
Consequently, to increase the amount of die programming that can be performed in parallel and the amount of user data that can be programmed in a particular I/O transaction, a control circuitry can combine multiple write commands into a single write-multi command. Persons skilled in the art will appreciate that a write-multi command can be formed using any suitable approach. For example, the control circuitry can form a write-multi command by combining a set of logical addresses, a set of counters, and a set of buffers associated with the multiple commands into an array of logical addresses, an array of counters, and an array of buffers, respectively.
After forming the write-multi command, the control circuitry can dispatch the write-multi command to a NVM (e.g., in the form of an API operation). By forming and dispatching a write-multi command, user data can be programmed more efficiently in the NVM without wasting valuable space.
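One way to picture the formation of a write-multi command from the arrays described above is the following sketch. The class and field names are hypothetical, and the meaning of each "counter" (taken here to be the number of LBAs covered by a command) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WriteCommand:
    logical_address: int
    counter: int   # assumption: number of LBAs covered by this command
    buffer: bytes  # user data to be programmed


@dataclass
class WriteMultiCommand:
    logical_addresses: List[int]
    counters: List[int]
    buffers: List[bytes]


def form_write_multi(commands):
    """Combine per-command fields into three parallel arrays."""
    return WriteMultiCommand(
        logical_addresses=[c.logical_address for c in commands],
        counters=[c.counter for c in commands],
        buffers=[c.buffer for c in commands],
    )


pending = [WriteCommand(100, 1, b"a" * 4096), WriteCommand(240, 2, b"b" * 8192)]
write_multi = form_write_multi(pending)
```

Keeping the three arrays parallel means that index i of each array describes the i-th original command, so the commands can be dispatched together without losing their individual addresses.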
In some embodiments, the system may have several options for splitting data writes into multiple write paths. FIG. 5 shows a graphical view of an illustrative programming flow 500 for a system.
As discussed, a file system (e.g., file system 210 of FIG. 2) can issue one or more commands (e.g., read, write, and/or trim commands) that can be saved in one or more queues in the system. For example, as shown in FIG. 5, queue 502 can represent a write queue that can be configured to retain one or more write commands. However, persons skilled in the art will appreciate that the system may include additional queues for storing read and/or trim commands. Alternatively, queue 502 can be configured to store write, read, and trim commands.
In some embodiments, after a write command is received in queue 502, a control circuitry (e.g., control circuitry 112 of FIG. 1) can select one of two paths for dispatching the write command. For example, the control circuitry can select a write path (e.g., path 504) for dispatching the write command to NVM 510. NVM 510 can be the same as or similar to NVM 120 (FIG. 1), NVM 220 (FIG. 2), or NVM 300 (FIG. 3).
In some cases, by selecting path 504, the write command can be dispatched directly to hardware engine 506. Hardware engine 506 can be, for example, a direct memory access (“DMA”) engine.
In some embodiments, prior to dispatching the write command to hardware engine 506, the control circuitry can encrypt the write command using an encryption module (e.g., encryption module 116 of FIG. 1). As discussed, the encryption module can encrypt the write command using any suitable encryption standard such as, for example, AES, DES, or RSA.
Subsequently, the control circuitry can pass the write command to NVM interface 508. NVM interface 508 can be the same as or substantially similar to NVM interface 218 of FIG. 2. Once NVM interface 508 receives the write command, a NVM bus controller (e.g., NVM bus controller 216 of FIG. 2) can be directed to dispatch the write command to NVM 510 across bus 512.
In some cases, instead of selecting path 504, the control circuitry can select a buffered path (e.g., path 520) for a particular write command. By selecting path 520, the control circuitry can transfer the write command from queue 502 to buffer 522. As discussed above, buffer 522 can be stored in volatile memory (e.g., memory 114 of FIG. 1), and can be configured to store any suitable number of write commands.
By storing one or more write commands in buffer 522, the control circuitry can wait to receive additional write commands from the file system before directing a bus controller to concurrently dispatch multiple commands to NVM 510 via path 524. For example, the multiple commands can be combined as a write-multi command, which can then be dispatched to NVM 510.
In some embodiments, the control circuitry can direct an encryption module to encrypt the write command before the write command is transferred to buffer 522. After encrypting the write command, the control circuitry can store the encrypted write command in buffer 522.
In other embodiments, the control circuitry can transfer an unencrypted write command to buffer 522. In such embodiments, however, the control circuitry may need to copy one or more encryption seeds (e.g., one or more keys and IVs) to volatile memory (e.g., memory 114 of FIG. 1). Then, when the write command is ready to be dispatched from buffer 522, the control circuitry can direct an encryption module to use the stored encryption seeds to encrypt the write command. Consequently, for such a configuration, buffer 522 may be associated with unique encryption characteristics (e.g., a fixed set of encryption seeds).
In order to retain system coherency, one or more “side-band signals” may be issued by an application, operating system, or file system of a system. As used herein, “side-band signals” can refer to any suitable signal that may be generated by a cache algorithm of an application, operating system, or file system to maintain coherency. For example, one type of side-band signal may be a synchronized cache signal, which may be generated by one or more applications in the system. Another side-band signal may be a force unit access (“FUA”) signal, which can be generated by HFS+.
In response to receiving one or more of these side-band signals, the control circuitry can direct a bus controller to flush the buffer. That is, the control circuitry can direct the bus controller to dispatch all of the commands currently stored in the buffer to the NVM via path 524.
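The flush behavior described above can be sketched as follows. The signal names, the class, and the dispatch callback are hypothetical stand-ins; in an actual system the dispatch would be performed by a bus controller across path 524.

```python
# Hypothetical identifiers for the side-band signals that trigger a flush.
FLUSH_SIGNALS = {"synchronize_cache", "fua"}


class WriteBuffer:
    """Toy stand-in for buffer 522 that flushes on side-band signals."""

    def __init__(self, dispatch):
        self.dispatch = dispatch  # callback standing in for the bus controller
        self.commands = []

    def add(self, command):
        self.commands.append(command)

    def on_side_band_signal(self, signal):
        # Ignore unrelated signals and skip the dispatch if nothing is buffered.
        if signal not in FLUSH_SIGNALS or not self.commands:
            return 0
        pending, self.commands = self.commands, []
        self.dispatch(pending)  # all buffered commands leave in one batch
        return len(pending)


sent = []
buffer_522 = WriteBuffer(dispatch=sent.append)
buffer_522.add("write A")
buffer_522.add("write B")
flushed = buffer_522.on_side_band_signal("fua")
```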
Because path 520 involves an extra step of copying a write command from queue 502 to buffer 522, additional traffic may be generated in the system. As a result, in some scenarios, path 520 may be less efficient as compared to path 504. For example, if the transferred command is a large command, it may be more efficient to directly program the command to NVM 510 via path 504.
On the other hand, if the control circuitry indiscriminately selects path 504 for dispatching all commands (including small commands), system performance may also decrease. Accordingly, the control circuitry may need to apply one or more heuristics to select a path that is best suited for dispatching a particular write command.
One of the heuristics that may be used is the size of a write command. Based at least in part on the size of the write command, the control circuitry can determine whether or not to buffer a write command.
For example, the control circuitry can first determine the size of a write command that is saved in queue 502. The control circuitry can then determine if the size of the write command is below a pre-determined threshold. In response to determining that the size of the write command is below the pre-determined threshold (e.g., the write command is a relatively small write command), the control circuitry can determine to perform buffering of the write command.
Consequently, the control circuitry can transfer the write command from queue 502 to buffer 522 via path 520. Once the write command has been stored in buffer 522, the command can be combined with other write commands at a later time.
If the control circuitry instead determines that the size of the write command is above the pre-determined threshold (e.g., the write command is a relatively large write command), the control circuitry can determine not to perform buffering of the write command. As a result, the control circuitry can direct a bus controller to dispatch the write command from queue 502 to NVM 510 via path 504.
Accordingly, by selecting a path based on the size of a write command, the control circuitry can reduce the total required bandwidth for executing write commands from an overall system perspective.
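As a non-limiting illustration, the size-based path selection described above can be sketched in Python as follows. The WriteCommand type, the select_path function, and the threshold value are hypothetical and are introduced for this sketch only; the disclosure does not specify a threshold value or its units. A command whose size equals the threshold is treated here as not below the threshold.

```python
from dataclasses import dataclass

# Hypothetical threshold in LBAs; the disclosure only says "pre-determined".
SMALL_WRITE_THRESHOLD_LBAS = 8


@dataclass
class WriteCommand:
    lba: int     # starting logical block address
    length: int  # size of the write in LBAs


def select_path(cmd: WriteCommand,
                threshold: int = SMALL_WRITE_THRESHOLD_LBAS) -> str:
    """Return "buffer" (path 520) for a small write, "direct" (path 504) otherwise."""
    return "buffer" if cmd.length < threshold else "direct"
```

Under these assumptions, a two-LBA write would be routed to the buffer, while a 64-LBA write would be dispatched directly to the NVM.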
In other embodiments, in addition to the size of a write command, the control circuitry can use other heuristics to determine which of path 504 or path 520 to select. For example, the control circuitry can use one or more heuristics to determine if system efficiency can be improved by combining multiple write commands into a write-multi command directly in queue 502. In particular, once the write commands have been combined into the write-multi command, the write-multi command can be dispatched directly from queue 502 via path 504.
One of the heuristics that can be used is the number of write commands saved in queue 502. For instance, the control circuitry can determine whether the number of write commands saved in queue 502 exceeds a pre-determined number. In response to determining that the number of write commands exceeds the pre-determined number, the control circuitry can combine the write commands into a write-multi command, and can direct a bus controller to dispatch the write-multi command from queue 502 to NVM 510. This way, the control circuitry can gain the benefits of concurrently dispatching multiple write commands, while also avoiding the extra step of copying a write command to buffer 522.
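By way of example only, the queue-depth heuristic can be sketched as follows. The function name maybe_combine, the dictionary used to stand in for a write-multi command, and the limit value are assumptions made for this sketch; commands are modeled as (start LBA, length) tuples.

```python
from typing import List, Optional, Tuple

# Hypothetical value; the disclosure only refers to "a pre-determined number".
QUEUE_DEPTH_LIMIT = 4


def maybe_combine(queue: List[Tuple[int, int]],
                  limit: int = QUEUE_DEPTH_LIMIT) -> Optional[dict]:
    """If the queue holds more than `limit` writes, drain it into one write-multi.

    Returns None (and leaves the queue untouched) when the queue is too shallow.
    """
    if len(queue) <= limit:
        return None
    write_multi = {"op": "write_multi", "commands": list(queue)}
    queue.clear()  # the combined commands leave the queue together
    return write_multi
```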
Another heuristic that may be used is I/O patterns. For example, the control circuitry can assume that I/Os generally arrive in batches. For instance, during a particular period of time, the control circuitry may be more likely to receive a burst of read commands or write commands rather than write commands interspersed with read commands. Consequently, if the control circuitry has already received a series of sequential writes, the control circuitry may expect to receive additional sequential writes in the near future.
Under such an assumption, the control circuitry can determine whether a set of the most recently received write commands in queue 502 is sequential. In response to determining that the set of the most recently received write commands is sequential, the control circuitry can wait to receive additional write commands in queue 502.
Once the control circuitry receives those additional write commands, the control circuitry can combine the additional write commands with the set of the most recently received commands into a write-multi command. The control circuitry can then direct a bus controller to dispatch the write-multi command from queue 502 to NVM 510. Consequently, the control circuitry can bypass buffer 522 for sequential write commands.
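One possible way to test whether the most recently received write commands are sequential is sketched below. The sketch assumes that "sequential" means each command begins at the LBA immediately following the end of the previous command; the disclosure does not define the term this precisely. The function name is_sequential and the window size are likewise hypothetical.

```python
from typing import List, Tuple


def is_sequential(commands: List[Tuple[int, int]], window: int = 3) -> bool:
    """Check whether the last `window` commands form one contiguous LBA run.

    `commands` holds (start_lba, length) tuples in arrival order.
    """
    recent = commands[-window:]
    if len(recent) < 2:
        return False  # a single command gives no evidence of a pattern
    return all(prev[0] + prev[1] == cur[0]
               for prev, cur in zip(recent, recent[1:]))
```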
In some embodiments, while the control circuitry is waiting to receive additional write commands in queue 502, the control circuitry can instead service a read queue and/or a trim queue stored in memory. As such, the control circuitry can ignore existing fairness rules, and allow write commands to stack up in queue 502.
As another example, the control circuitry can determine whether or not to buffer a small write command based on the number of dispatched write commands immediately preceding the write command and/or the amount of time that has elapsed since the last dispatched write command. That is, based on an observation of past access patterns, the control circuitry can determine that additional write commands may likely arrive in the near future. Consequently, it may be more efficient to hold on to a small write command in queue 502, and wait for additional write commands.
For instance, if the control circuitry determines that the number of dispatched write commands immediately preceding the write command exceeds a pre-determined number, the control circuitry can wait to receive one or more additional write commands in queue 502. Similarly, the control circuitry may wait to receive additional write commands if the amount of time that has elapsed since the last dispatched write command is less than a pre-determined time period (e.g., a few milliseconds).
Once the control circuitry receives one or more additional write commands in queue 502, the control circuitry can combine the additional write commands with the small write command into a write-multi command. The control circuitry can then direct a bus controller to dispatch the write-multi command from queue 502 to NVM 510.
In the examples discussed above, persons skilled in the art will appreciate that the control circuitry can transfer one or more write commands from queue 502 to buffer 522 after a period of time has elapsed (e.g., a few milliseconds). That is, if the control circuitry does not actually receive the expected write commands, the control circuitry can transfer the small write command to buffer 522.
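The hold-then-fall-back behavior described above can be illustrated with the following sketch. All names and default values (hold_decision, burst_count, recent_ms, max_wait_ms) are hypothetical; the disclosure gives only "a pre-determined number" and "a few milliseconds". Elapsed times are passed in as arguments so that the sketch stays independent of any particular clock source.

```python
def hold_decision(preceding_dispatched: int,
                  ms_since_last_dispatch: float,
                  ms_waited: float,
                  *,
                  burst_count: int = 4,
                  recent_ms: float = 5.0,
                  max_wait_ms: float = 5.0) -> str:
    """Decide what to do with a small write command held in the queue.

    Returns "wait" while more writes are still expected, and "buffer" once
    no more writes are expected or the wait has timed out.
    """
    expects_more = (preceding_dispatched > burst_count
                    or ms_since_last_dispatch < recent_ms)
    if not expects_more:
        return "buffer"  # nothing suggests a burst; buffer right away
    if ms_waited >= max_wait_ms:
        return "buffer"  # expected writes never arrived; stop waiting
    return "wait"
```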
As yet another example, the control circuitry can determine whether or not to buffer a small write command based on the number of remaining write commands saved in queue 502. For instance, if queue 502 only has a single write command or a small number of write commands, the control circuitry can transfer the small write command to buffer 522.
As a further example, the control circuitry can determine whether to perform buffering based at least in part on the amount of available space in memory. For instance, if the amount of available space in memory is below a pre-determined threshold, the control circuitry can select path 504 and bypass buffering. On the other hand, if the amount of available space is above the pre-determined threshold, the control circuitry can select path 520, and buffer a small write command.
As yet a further example, the control circuitry may use past performance of the I/O scheduling algorithm to determine whether or not to buffer. For instance, the control circuitry can maintain a scorecard of past performance of the scheduling algorithm, which can include the success rate of past buffering decisions. If the control circuitry determines that the success rate is relatively low, the control circuitry can adapt the heuristics used in determining whether or not to buffer.
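A scorecard of this kind could take many forms. The sketch below is one hypothetical form, under two assumptions that are not stated in the disclosure: a buffering decision counts as a success if the buffered command was later combined with other commands, and the adaptation consists of halving the size threshold so that fewer commands are buffered. The class name Scorecard and all parameter values are illustrative.

```python
class Scorecard:
    """Tracks the outcome of past buffering decisions and adapts a threshold."""

    def __init__(self, threshold: int = 8, min_threshold: int = 1,
                 low_rate: float = 0.5, min_samples: int = 4) -> None:
        self.threshold = threshold        # current small-write threshold (LBAs)
        self.min_threshold = min_threshold
        self.low_rate = low_rate          # success rate considered "low"
        self.min_samples = min_samples    # decisions needed before adapting
        self.hits = 0
        self.total = 0

    def record(self, success: bool) -> None:
        self.total += 1
        self.hits += int(success)

    def success_rate(self) -> float:
        return self.hits / self.total if self.total else 1.0

    def adapt(self) -> int:
        """Halve the threshold when the success rate is low, then start over."""
        if self.total >= self.min_samples and self.success_rate() < self.low_rate:
            self.threshold = max(self.min_threshold, self.threshold // 2)
            self.hits = self.total = 0
        return self.threshold
```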
As another example, the control circuitry can determine whether or not to buffer based on the size of the buffer, or, particularly, the size of the buffer relative to the size of a write command. For example, if the size of the write command is the same as or larger than the size of buffer 522, the control circuitry can determine not to transfer the write command to buffer 522, and instead select to dispatch the write command directly to NVM 510 via path 504.
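The two capacity-related checks (available memory and buffer size relative to command size) can be combined into a single test, as in the following sketch. The function name and parameters are hypothetical, all sizes are assumed to share the same unit, and an amount of available space exactly equal to the threshold is treated as sufficient, which the disclosure leaves open.

```python
def capacity_allows_buffering(cmd_size: int, buffer_size: int,
                              free_memory: int, min_free_memory: int) -> bool:
    """Return True if neither capacity check rules out buffering (path 520)."""
    if free_memory < min_free_memory:
        return False  # memory is scarce: bypass buffering via path 504
    if cmd_size >= buffer_size:
        return False  # command would not fit usefully in the buffer
    return True
```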
If, however, the size of the buffer is greater than the size of the write command, the control circuitry can apply one or more algorithms in order to reduce fragmentation across dies for a particular write span. In particular, an assumption can be made that user data is generally read from a NVM with the same granularity as the granularity with which the user data was programmed (e.g., along similar boundaries). For example, a system may save a text document to a NVM. Subsequently, the system will likely load the entire document as well. As another example, when a system programs a database in a NVM, the database is programmed using small, random updates. Likewise, at a later time, the database will likely be accessed using random reads. As such, the system can expect that user data associated with a sequence of small write commands will be read simultaneously.
Conventionally, a system may separately program each write command to a NVM. As a result, the user data associated with the write commands may be split among multiple dies, which can increase the energy (e.g., the overall power consumption) that it will take to read the user data at a later time. This is because each die needs to be activated in order to read the user data.
Accordingly, in order to maximize concurrency (e.g., minimize channel doubling) and minimize read energy, the control circuitry can attempt to re-order the write commands in a queue along page-aligned boundaries of a buffer. In particular, the control circuitry can attempt to select one or more commands from a queue in order to fill a buffer along one or more page-aligned boundaries.
Thus, under such a scenario, the system may take page alignment into consideration when storing write commands into a buffer. For example, as shown in FIG. 6, buffer 600 includes 8 page-aligned portions (e.g., page-aligned portions 0-7), where each page-aligned portion includes space to hold four LBAs. Each page-aligned portion of buffer 600 may be configured to store the same amount of data as a page of a NVM (e.g., NVM 300 of FIG. 3 or NVM 510 of FIG. 5). Persons skilled in the art will appreciate that buffer 600 may include any suitable number of page-aligned portions.
For example, as shown, page-aligned portions 0 and 1 of buffer 600 have been completely filled with write commands, and page-aligned portion 2 of buffer 600 has only been partially filled. In particular, write commands 0-3 are stored in page-aligned portion 0, write commands 4-7 are stored in page-aligned portion 1, and write commands 8 and 9 are stored in page-aligned portion 2. The remaining portions of buffer 600 are empty.
Based on this information, the control circuitry can determine that the write commands stored in buffer 600 are not currently aligned with at least one page boundary (e.g., page-aligned portion 2 has not been completely filled). In response to determining that the write commands are not aligned in buffer 600, the control circuitry can detect a fill size that is needed to fill buffer 600 to at least one page boundary. For page-aligned portion 2, for instance, the control circuitry can determine that the fill size is the size of two LBAs (e.g., sections 602 and 604).
The control circuitry can then determine if there is at least one write command in a queue (e.g., queue 502 of FIG. 5) that has a size equal to the fill size. For example, for page-aligned portion 2 of buffer 600, the control circuitry can select one write command with a LBA size of two or two write commands each with a LBA size of one. Persons skilled in the art will appreciate that any suitable algorithm can be used to determine if there is at least one write command in a queue that has a size equal to the fill size such as, for example, a best-fit algorithm. Persons skilled in the art will also appreciate that if there are multiple write commands in a queue that satisfy the fill size, the control circuitry can use one or more of the heuristics discussed above to select write command(s) for filling the buffer.
Thus, if the control circuitry determines that there is at least one write command that has a size equal to the fill size, the control circuitry can transfer the at least one write command from the queue to buffer 600. In the example shown in FIG. 6, for instance, the control circuitry can completely fill page-aligned portion 2 of buffer 600 (e.g., fill sections 602 and 604 with one or more write commands).
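The fill-size computation and the selection of queued commands can be illustrated as follows, using the four-LBA page-aligned portions of FIG. 6. The function names fill_size and pick_fill are hypothetical. The selection shown is a simple exhaustive exact-fit search that prefers the fewest commands; it is not the best-fit algorithm mentioned above, and it is only practical for short queues.

```python
from itertools import combinations
from typing import List

LBAS_PER_PAGE = 4  # each page-aligned portion in FIG. 6 holds four LBAs


def fill_size(used_lbas: int, lbas_per_page: int = LBAS_PER_PAGE) -> int:
    """Number of LBAs needed to reach the next page boundary (0 if aligned)."""
    return (-used_lbas) % lbas_per_page


def pick_fill(queue_sizes: List[int], needed: int) -> List[int]:
    """Return queue indices whose sizes sum exactly to `needed`.

    Tries single commands first, then pairs, and so on. Returns an empty
    list if nothing fits or if no fill is needed.
    """
    if needed == 0:
        return []
    for count in range(1, len(queue_sizes) + 1):
        for idxs in combinations(range(len(queue_sizes)), count):
            if sum(queue_sizes[i] for i in idxs) == needed:
                return list(idxs)
    return []
```

For the state shown in FIG. 6, ten LBAs are occupied, so fill_size(10) yields two LBAs, and pick_fill would select either one two-LBA command or two one-LBA commands from the queue, whichever is found first.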
Once buffer 600 has been filled to at least one page-aligned portion (e.g., the write commands stored in buffer 600 are aligned with page-aligned portion 2), the control circuitry can combine the commands into a write-multi command. After generating the write-multi command, the control circuitry can direct a bus controller to dispatch the write-multi command from buffer 600 to a NVM.
In some embodiments, if the write commands stored in buffer 600 are encrypted commands, the control circuitry can first direct an encryption module to encrypt the at least one command prior to transferring the command from a queue to buffer 600. After the command has been encrypted, the control circuitry can store the at least one encrypted command in buffer 600.
As discussed above, the control circuitry may receive one or more side-band signals (e.g., synchronized cache or FUA signals) before a page-aligned portion of buffer 600 has been completely filled. In such a scenario, the control circuitry can fill buffer 600 to the end of the page-aligned portion (e.g., page-aligned portion 2) with invalid data (e.g., fill sections 602 and 604 with invalid data). The control circuitry can then combine the write commands stored in buffer 600 with the invalid data into a write-multi command, and direct the bus controller to dispatch the write-multi command from buffer 600 to the NVM.
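The padding step performed on a flush can be sketched as shown below. The function name pad_to_page_boundary is hypothetical, the buffer is modeled as a list with one entry per LBA, and None is used here merely as a stand-in for invalid data.

```python
from typing import List, Optional


def pad_to_page_boundary(buffered_lbas: List[Optional[int]],
                         lbas_per_page: int = 4,
                         filler: Optional[int] = None) -> List[Optional[int]]:
    """Return the buffer contents padded with filler up to the next page boundary."""
    pad = (-len(buffered_lbas)) % lbas_per_page
    return buffered_lbas + [filler] * pad
```

With ten buffered LBAs and four LBAs per page-aligned portion, two filler entries are appended, corresponding to sections 602 and 604 of FIG. 6.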
Accordingly, by optimally re-ordering write commands in a queue and fitting one or more of these write commands into page-aligned portions in a buffer, the system can minimize read energy on the read path, minimize buffering overhead, and reduce fragmentation in a NVM.
Referring back to FIG. 5, instead of selecting path 504 or path 520, a control circuitry can combine buffered write commands and queue write commands into the same batch or transaction (e.g., as a write-multi command) via path 530. For example, the control circuitry can select at least a first write command from queue 502 and at least a second write command from buffer 522, where the first and second write commands may be selected to minimize page crossings in NVM 510. After selecting the first and second write commands, the control circuitry can combine the commands to form a write-multi command. Finally, the control circuitry can direct a bus controller to dispatch the write-multi command to the NVM via path 530.
In some cases, in combining write commands from a buffer with write commands from a queue, the control circuitry may need to combine multiple types of encryption characteristics in a single transaction. For instance, the control circuitry can apply at least a first encryption seed to the one or more write commands selected from queue 502. In addition, if write commands stored in buffer 522 are unencrypted commands, the control circuitry can retrieve one or more encryption seeds from memory (e.g., memory 114 of FIG. 1). After retrieving the one or more encryption seeds, the control circuitry can apply the one or more encryption seeds to the one or more write commands selected from buffer 522. The control circuitry can then combine the first and second encrypted write commands into a write-multi command, and dispatch the write-multi command to NVM 510.
Referring now to FIGS. 7A-7B and 8, flowcharts of illustrative processes are shown in accordance with various embodiments of the invention. These processes may be executed by one or more components in a system (e.g., electronic device 100 of FIG. 1 or electronic device 200 of FIG. 2). For example, at least some of the steps in the processes of FIGS. 7A-7B and 8 may be performed by control circuitry (e.g., SoC control circuitry 112 of FIG. 1). Persons skilled in the art will appreciate that one or more of the steps may be performed by a NVM interface (e.g., NVM interface 118 of FIG. 1 or NVM interface 218 of FIG. 2). Dashed boxes are meant to illustrate steps that may or may not be performed based on the configuration of the system.
Turning first to FIGS. 7A and 7B, process 700 is shown for efficient buffering in a NVM (e.g., NVM 120 of FIG. 1, NVM 220 of FIG. 2, NVM 300 of FIG. 3, or NVM 510 of FIG. 5). Process 700 may begin at step 702, and at step 704, the control circuitry can receive a first write command from a file system (e.g., file system 210 of FIG. 2) for programming to a NVM.
At step 706, the control circuitry can save the first write command in a queue (e.g., queue 502 of FIG. 5). Continuing to step 708, the control circuitry can apply one or more heuristics to the first write command in order to determine whether to transfer the first write command to a buffer (e.g., buffer 522 of FIG. 5 or buffer 600 of FIG. 6). That is, the control circuitry can determine whether to perform buffering of the first write command.
If, at step 710, the control circuitry determines not to transfer the first write command, process 700 may move to step 712. At step 712, the control circuitry can direct an encryption module (e.g., encryption module 116 of FIG. 1) to encrypt the first write command. Then, at step 714, the control circuitry can direct a bus controller (e.g., NVM bus controller 216 of FIG. 2) to dispatch the encrypted write command to the NVM (e.g., via path 504 of FIG. 5). For example, the control circuitry may be configured to direct the bus controller to dispatch the encrypted write command to a hardware engine (e.g., hardware engine 506 of FIG. 5). Process 700 may then end at step 716.
If, at step 710, the control circuitry instead determines to transfer the first write command, process 700 may move to step 718 or, alternatively, to step 720. In particular, if the system is configured such that commands are encrypted prior to storage in a buffer, then, at step 718, the control circuitry can direct an encryption module to encrypt the first write command. On the other hand, if the system is configured such that unencrypted commands are stored in the buffer, process 700 may directly move to step 720. In such a configuration, the control circuitry can also copy one or more encryption seeds to memory (e.g., memory 114 of FIG. 1). Thus, at step 720, the control circuitry can store either the encrypted command or the unencrypted command in the buffer (e.g., via path 520 of FIG. 5).
Continuing to step 722, the control circuitry can determine if the write commands stored in the buffer are currently aligned with at least one page boundary. If, at step 722, the control circuitry determines that the write commands stored in the buffer are currently aligned with at least one page boundary, process 700 may move to step 724 or step 726. That is, if unencrypted commands are stored in the buffer, at step 724, the control circuitry can direct an encryption module to encrypt the write commands using one or more stored encryption seeds. If, however, encrypted commands are stored in the buffer, process 700 may move directly to step 726.
At step 726, the control circuitry can combine the write commands in the buffer into a write-multi command. Then, at step 728, the control circuitry can direct a bus controller to dispatch the write-multi command from the buffer to the NVM (e.g., via path 524 of FIG. 5). Process 700 may then end at step 716.
If, at step 722, the control circuitry instead determines that the write commands stored in the buffer are not currently aligned with at least one page boundary, process 700 may move to step 730. At step 730, the control circuitry can detect a fill size that is needed to fill the buffer to the at least one page boundary.
Continuing to step 732, the control circuitry can determine if there is at least one write command in the queue that has a size equal to the fill size. If, at step 732, the control circuitry determines that there is at least one write command that has a size equal to the fill size, process 700 may move to step 734.
At step 734, the control circuitry can transfer the at least one write command from the queue to the buffer. Process 700 may then return to step 724 or step 726 (e.g., depending on whether or not the write commands in the buffer are encrypted).
If, at step 732, the control circuitry instead determines that there is not at least one write command in the queue that has a size equal to the fill size, process 700 may move to step 736. At step 736, the control circuitry can wait for additional write commands to arrive (e.g., wait to receive additional write commands from a file system). Process 700 may then end at step 716.
Turning now to FIG. 8, process 800 is shown for determining whether or not to buffer in a NVM (e.g., NVM 120 of FIG. 1, NVM 220 of FIG. 2, NVM 300 of FIG. 3, or NVM 510 of FIG. 5). In some embodiments, process 800 may represent a more detailed view of the applying at least one heuristic step 708 and the determining whether to transfer step 710 of process 700 (FIGS. 7A-7B).
Process 800 may begin at step 802, and at step 804, the control circuitry can determine the size of a write command saved in a queue (e.g., queue 502 of FIG. 5).
Then, at step 806, the control circuitry can determine whether the size of the write command is below a pre-determined threshold. If, at step 806, the control circuitry determines that the size of the write command is not below the pre-determined threshold (e.g., the write command is a large write command), process 800 may move to step 808.
At step 808, the control circuitry can direct a bus controller to dispatch the write command from the queue to a NVM (e.g., via path 504 of FIG. 5). Process 800 may then end at step 810.
If, at step 806, the control circuitry instead determines that the size of the write command is below the pre-determined threshold (e.g., the write command is a small write command), process 800 may move to step 812.
At step 812, the control circuitry can determine whether the number of write commands saved in the queue exceeds a pre-determined number. If, at step 812, the control circuitry determines that the number of write commands saved in the queue exceeds the pre-determined number, process 800 may move to step 814.
At step 814, the control circuitry can combine the write commands into a write-multi command. Process 800 may then move to step 808, where the write-multi command can be dispatched from the queue to the NVM.
If, at step 812, the control circuitry instead determines that the number of write commands saved in the queue does not exceed the pre-determined number, process 800 may move to step 816.
At step 816, the control circuitry can determine whether a set of most recently received write commands in the queue is sequential. If, at step 816, the control circuitry determines that the set of most recently received write commands is sequential, process 800 may move to step 818.
At step 818, the control circuitry can wait to receive additional write commands in the queue. Then, at step 820, the control circuitry can determine if at least one additional write command has been received from a file system (e.g., file system 210 of FIG. 2). For example, the control circuitry can make this determination after a short period of time (e.g., a few milliseconds).
If, at step 820, the control circuitry determines that it has received at least one additional write command from the file system, process 800 may move to step 814, where the write commands can be combined into a write-multi command.
If, at step 820, the control circuitry instead determines that it has not received at least one additional write command from the file system, process 800 may move to step 822. At step 822, the control circuitry can transfer the write command to a buffer (e.g., via path 520 of FIG. 5). Process 800 may then end at step 810.
If, at step 816, the control circuitry instead determines that the set of most recently received write commands in the queue is not sequential, process 800 may move to step 824. At step 824, the control circuitry can determine whether the number of dispatched write commands immediately preceding the write command exceeds a pre-determined number.
If, at step 824, the control circuitry determines that the number of dispatched write commands exceeds the pre-determined number, process 800 may move to step 818, where the control circuitry can wait to receive additional commands in the queue. If, at step 824, the control circuitry instead determines that the number of dispatched write commands does not exceed the pre-determined number, process 800 may move to step 822, where the control circuitry can transfer the write command to the buffer.
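For illustration only, the decision flow of process 800 can be condensed into a single function, as sketched below. The function name, its parameters, and the default threshold values are hypothetical. The waiting performed at step 818 is not modeled; instead, the outcome of step 820 is supplied as the more_arrived argument.

```python
def process_800(cmd_size: int,
                queue_depth: int,
                recent_sequential: bool,
                preceding_dispatched: int,
                more_arrived: bool,
                *,
                size_threshold: int = 8,
                depth_limit: int = 4,
                burst_count: int = 4) -> str:
    """Return "dispatch", "combine_and_dispatch", or "buffer" for one command."""
    if cmd_size >= size_threshold:   # step 806: not a small write
        return "dispatch"            # step 808 (path 504)
    if queue_depth > depth_limit:    # step 812: queue is deep enough
        return "combine_and_dispatch"  # steps 814 and 808
    # Steps 816 and 824: either condition leads to waiting at step 818.
    if recent_sequential or preceding_dispatched > burst_count:
        # Step 820: did at least one additional write arrive while waiting?
        return "combine_and_dispatch" if more_arrived else "buffer"
    return "buffer"                  # step 822 (path 520)
```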
It should be understood that processes 700 and 800 of FIGS. 7A-7B and 8 are merely illustrative. Any of the steps may be removed, modified, or combined, and any additional steps may be added, without departing from the scope of the invention.
The described embodiments of the invention are presented for the purpose of illustration and not of limitation.