CN119718727A - Cache unit management method and storage device - Google Patents

Cache unit management method and storage device

Info

Publication number
CN119718727A
Authority
CN
China
Prior art keywords
command processing
processing completion
completion message
storage
storage command
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311255502.4A
Other languages
Chinese (zh)
Inventor
王玉巧
兰彤
黄好城
刘传杰
李正审
蔡德卿
肖峰
聂鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Starblaze Technology Co ltd
Original Assignee
Beijing Starblaze Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Starblaze Technology Co ltd
Priority to CN202311255502.4A
Publication of CN119718727A

Abstract

Translated from Chinese


The embodiment of the present application provides a cache unit management method and storage device, which relates to the field of storage technology. The method includes: in response to receiving an NVMe read command, adding the first address location information indicated by the NVMe read command to the allocated cache unit and recording the first number of storage command processing completion messages corresponding to the first address location information that have been received; in response to receiving a storage command processing completion message, according to the expected number of times the second address location information indicated by the storage command processing completion message is used and the second number of storage command processing completion messages corresponding to the second address location information that have been recorded, determining whether to release the cache unit occupied by the second address location information and determining whether there is an erroneous storage command processing completion message. The method can release the cache unit occupied by the address location information in a timely manner, and at the same time detect whether there is an erroneous storage command processing completion message in the process of releasing the cache unit.

Description

Cache unit management method and storage device
Technical Field
The present application relates to the field of storage technologies, and in particular, to a cache unit management method and a storage device.
Background
FIG. 1A illustrates a block diagram of a solid-state storage device. The solid-state storage device 102 is coupled to a host and provides storage capability for the host. The host and the solid-state storage device 102 may be coupled in a variety of ways, including, but not limited to, SATA (Serial Advanced Technology Attachment), SCSI (Small Computer System Interface), SAS (Serial Attached SCSI), IDE (Integrated Drive Electronics), USB (Universal Serial Bus), PCIe (Peripheral Component Interconnect Express), NVMe (NVM Express), Ethernet, Fibre Channel, a wireless communication network, and the like. The host may be an information processing device capable of communicating with the storage device in the manner described above, such as a personal computer, tablet, server, portable computer, network switch, router, cellular telephone, or personal digital assistant. The storage device 102 (hereinafter, the solid-state storage device is referred to simply as the storage device) includes an interface 103, a control unit 104, one or more NVM chips 105, and a DRAM (Dynamic Random Access Memory) 110.
The NVM chip 105 described above includes common storage media such as NAND flash memory, phase change memory, FeRAM (Ferroelectric RAM), MRAM (Magnetic Random Access Memory), RRAM (Resistive Random Access Memory), and the like.
The interface 103 may be adapted to exchange data with the host by means of, for example, SATA, IDE, USB, PCIe, NVMe, SAS, Ethernet, Fibre Channel, etc.
The control unit 104 controls data transmission among the interface 103, the NVM chip 105, and the DRAM 110, and is also responsible for memory management, mapping of host logical addresses to flash physical addresses, wear leveling, bad block management, and the like. The control unit 104 can be implemented in a variety of ways, such as software, hardware, firmware, or a combination thereof; for example, it can take the form of an FPGA (Field-Programmable Gate Array), an ASIC (Application-Specific Integrated Circuit), or a combination thereof. The control unit 104 may also include a processor or controller in which software is executed to manipulate the hardware of the control unit 104 to process IO (Input/Output) commands. The control unit 104 may also be coupled to the DRAM 110 and may access data in the DRAM 110. FTL tables and/or cached data of IO commands may be stored in the DRAM.
The control unit 104 issues commands to the NVM chip 105 in a manner conforming to the interface protocol of the NVM chip 105 in order to operate the NVM chip 105, and receives the command execution results output by the NVM chip 105. Known NVM chip interface protocols include "Toggle", "ONFI", and the like.
A storage target (Target) is one or more logical units (LUNs, Logical Units) that share a CE (Chip Enable) signal within a NAND flash package. The NAND flash package may include one or more dies (Die). Typically, a logical unit corresponds to a single die. A logical unit may include multiple planes (Plane). Multiple planes within a logical unit may be accessed in parallel, while multiple logical units within a NAND flash chip may execute commands and report status independently of each other. Data is typically stored on and read from the storage medium in units of pages, while data is erased in units of blocks. A block (also called a physical block) contains a plurality of pages. Pages on the storage medium (called physical pages) have a fixed size, e.g., 17664 bytes. Physical pages may also have other sizes.
An FTL (Flash Translation Layer) is used in the storage device 102 to maintain mapping information from logical addresses (LBAs) to physical addresses. The logical addresses constitute the storage space of the solid-state storage device as perceived by upper-level software such as the operating system. A physical address is an address for accessing a physical storage unit of the solid-state storage device. In the related art, address mapping may also be implemented using an intermediate address form. For example, logical addresses are mapped to intermediate addresses, which in turn are further mapped to physical addresses. The table structure that stores the mapping information from logical addresses to physical addresses is called an FTL table. FTL tables are important metadata in a storage device. The entries of the FTL table record address mapping relations at the granularity of the data units in the storage device.
The host accesses the storage device using IO commands that follow the storage protocol. The control unit generates one or more media interface commands based on the IO commands from the host and provides the media interface commands to the media interface controller. The media interface controller generates, according to the media interface commands, storage media access commands (e.g., program commands, read commands, erase commands) that follow the interface protocol of the NVM chip. The control unit also tracks whether all media interface commands generated from one IO command have been executed, and indicates the result of processing the IO command to the host.
Referring to FIG. 1B, the control unit includes a host interface 1041, a host command processing unit 1042, a storage command processing unit 1043, a media interface controller 1044, and a storage medium management unit 1045. The host interface 1041 acquires IO commands provided by the host. The host command processing unit 1042 generates storage commands from an IO command and supplies them to the storage command processing unit 1043. Each storage command may access a storage space of the same size, e.g., 4KB. The unit of data that is recorded in the NVM chip and accessed by one storage command is referred to as a data frame. A physical page records one or more data frames. For example, if the physical page size is 17664 bytes and the data frame size is 4KB, then one physical page can store 4 data frames.
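The frame count in the example above follows from integer division of the page size by the frame size. A minimal sketch, in which `frames_per_page` is a hypothetical helper name introduced only for illustration; what the leftover bytes of the page hold is not stated in the text and is not modeled here:

```python
def frames_per_page(page_size_bytes: int, frame_size_bytes: int) -> int:
    """Return how many whole data frames fit in one physical page."""
    if page_size_bytes <= 0 or frame_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    return page_size_bytes // frame_size_bytes


PAGE_SIZE = 17664        # physical page size from the example, in bytes
FRAME_SIZE = 4 * 1024    # 4KB data frame

count = frames_per_page(PAGE_SIZE, FRAME_SIZE)
leftover = PAGE_SIZE - count * FRAME_SIZE
print(count, leftover)   # 4 frames, 1280 bytes left over
```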
The storage medium management unit 1045 maintains the logical-address-to-physical-address translation for each storage command. For example, the storage medium management unit 1045 includes FTL tables (the FTL is described above). For a read command, the storage medium management unit 1045 outputs the physical address corresponding to the logical address (LBA) accessed by the storage command. For a write command, the storage medium management unit 1045 allocates an available physical address to it, and records the mapping between the logical address (LBA) it accesses and the allocated physical address. The storage medium management unit 1045 also implements functions required to manage the NVM chips, such as garbage collection and wear leveling.
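The read and write behavior just described can be sketched as follows. This is a minimal illustration that assumes a flat in-memory table and a simple free list; the class `SimpleFtl` and its method names are hypothetical, and garbage collection and wear leveling are deliberately left out:

```python
from collections import deque


class SimpleFtl:
    """Toy logical-to-physical mapping (hypothetical, for illustration)."""

    def __init__(self, num_physical_units: int):
        self._table = {}                               # LBA -> physical address
        self._free = deque(range(num_physical_units))  # unallocated addresses

    def lookup(self, lba: int):
        """Read path: return the physical address for an LBA, or None."""
        return self._table.get(lba)

    def write(self, lba: int) -> int:
        """Write path: allocate a free physical address and record the mapping."""
        if not self._free:
            raise RuntimeError("no free physical address")
        physical = self._free.popleft()
        # The previously mapped address (if any) becomes stale; reclaiming
        # it would be the job of garbage collection, which is not modeled.
        self._table[lba] = physical
        return physical
```

Writing the same LBA twice yields two different physical addresses, and `lookup` returns the most recent one.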
The storage command processing unit 1043 operates the media interface controller 1044 to issue storage media access commands to the NVM chip 105 according to the physical address supplied by the storage medium management unit 1045.
For purposes of clarity, commands sent by the host to the storage device 102 are referred to as IO commands (including, for example, NVMe read commands, NVMe write commands), commands sent by the host command processing unit 1042 to the storage command processing unit 1043 are referred to as storage commands, commands sent by the storage command processing unit 1043 to the media interface controller 1044 are referred to as media interface commands, and commands sent by the media interface controller 1044 to the NVM chip 105 are referred to as storage media access commands. The storage medium access command follows the interface protocol of the NVM chip.
In the NVMe protocol, there are two types of commands: admin commands, which the host uses to manage and control the storage device, and IO commands, including NVMe write commands and NVMe read commands, which control data transmission between the host and the storage device. After receiving an NVMe write command, the solid-state storage device 102 obtains data from the memory of the host through the host interface 1041 and writes the data into the flash memory. For an NVMe read command, after data is read from the flash memory, the solid-state storage device 102 moves the data into host memory through the host interface 1041. PRP (Physical Region Page) and SGL (Scatter Gather List) are two ways of describing the data to be transferred between a host and a storage device. In the PRP approach, several PRP entries linked together are used, each PRP entry including a 64-bit physical memory address that describes one physical page (Page) of host memory. An SGL is a linked list of one or more SGL segments, each SGL segment in turn being composed of one or more SGL descriptors. Each SGL descriptor describes the address and length of a data buffer in host memory, i.e., each SGL descriptor corresponds to a region of the host memory address space, and each SGL descriptor has, for example, a fixed size (e.g., 16 bytes).
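As an illustration of the PRP approach, the sketch below counts how many PRP entries are needed to describe one host buffer. It assumes a 4KB host memory page and that only the first entry may start at an offset inside a page; both the page size and the helper name `prp_entry_count` are assumptions made for this example, not values taken from the text:

```python
def prp_entry_count(start_addr: int, length: int, page_size: int = 4096) -> int:
    """Number of host memory pages (PRP entries) touched by a buffer.

    start_addr: host physical address of the first byte of the buffer.
    length:     buffer length in bytes.
    page_size:  assumed host memory page size in bytes.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    offset = start_addr % page_size              # offset inside the first page
    return (offset + length + page_size - 1) // page_size


print(prp_entry_count(0x1000, 4096))   # aligned 4KB buffer: 1 entry
print(prp_entry_count(0x1200, 4096))   # unaligned 4KB buffer: 2 entries
```

An unaligned buffer straddles a page boundary, which is why the same amount of data can need one more entry.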
The SGL-related or PRP-related fields in NVMe write commands and NVMe read commands indicate the location of the data in host memory (for write commands) or the host memory address to which data is to be written (for read commands). The PRP field or SGL field in an NVMe write command or NVMe read command may be an SGL or PRP entry pointing to the host memory address space to be accessed, may be a pointer to an SGL or PRP linked list, or may even be a pointer to a pointer. If the NVMe write command or NVMe read command carries the SGL or PRP itself, the storage device can acquire the SGL or PRP directly in response to receiving the IO command. If the NVMe write command or NVMe read command carries an SGL or PRP pointer, then in response to receiving the IO command, the storage device accesses the host according to the SGL or PRP pointer and obtains the SGL or PRP from the host. A host command processing unit that processes NVMe read commands is provided in Chinese patent 202110746144.1, and a host command processing unit that processes NVMe write commands is provided in Chinese patent 202110746142.2; these describe the process by which a storage device acquires PRP entries or SGL descriptors from NVMe write commands or NVMe read commands.
In high-performance storage devices, the control unit may process hundreds or thousands of IO commands (including NVMe read commands and NVMe write commands) per second, each carrying a PRP field or SGL field. However, the resources of the cache unit that stores the PRP field or SGL field are limited, so these resources must be managed for the cache unit to be able to cache the PRPs or SGLs transmitted from the host to the storage device.
Disclosure of Invention
In view of this, the embodiments of the present application provide a method and a storage device for managing a cache unit, which can timely release a cache unit occupied by address location information indicated by an IO command (for example, a PRP entry, an SGL descriptor, or a table according to a PRP entry or an SGL descriptor), so that the cache unit has sufficient space to cache the address location information indicated by a subsequent IO command, thereby improving the utilization rate of the cache unit, and simultaneously detecting whether there is an erroneous storage command processing completion message, so as to determine whether a received NVMe read command is correctly processed.
According to a first aspect of the present application, there is provided a cache unit management method, including: in response to receiving an NVMe read command, adding each piece of first address location information indicated by the NVMe read command to an allocated cache unit, and recording a first number of times that a storage command processing completion message corresponding to each piece of first address location information has been received; and
in response to receiving a storage command processing completion message, determining whether to release the cache unit occupied by second address location information and determining whether an erroneous storage command processing completion message exists, according to the expected number of times the second address location information indicated by the storage command processing completion message is used and the recorded second number of times that a storage command processing completion message corresponding to the second address location information has been received.
In an alternative embodiment, if the expected number of times the second address location information is used is equal to the value of the second number of times plus 1, the cache unit occupied by the second address location information is released; if the expected number of times is greater than the value of the second number of times plus 1, the cache unit occupied by the second address location information is not released and the value of the second number of times is incremented by 1; and if the expected number of times is less than the value of the second number of times plus 1, it is determined that an erroneous storage command processing completion message exists.
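The three-way comparison in this embodiment can be sketched as a small function. This is an illustrative reading of the text rather than the claimed circuit; the names `Decision` and `on_completion` are hypothetical:

```python
from enum import Enum


class Decision(Enum):
    RELEASE = "release"   # the last expected completion has arrived
    KEEP = "keep"         # further completions are still expected
    ERROR = "error"       # more completions arrived than expected


def on_completion(expected_uses: int, received_count: int):
    """Handle one completion message for a piece of address location info.

    expected_uses:  expected number of times the information is used.
    received_count: completions already recorded for it (the "second
                    number of times").
    Returns (decision, updated_count).
    """
    arrived = received_count + 1          # count including this message
    if expected_uses == arrived:
        return Decision.RELEASE, received_count
    if expected_uses > arrived:
        return Decision.KEEP, arrived     # counter is incremented by 1
    return Decision.ERROR, received_count
```

For example, information expected to be used twice is kept on the first completion (the count goes from 0 to 1) and released on the second.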
In an alternative embodiment, the method further includes: in the case that the expected number of times each piece of first address location information indicated by the NVMe read command is used is not more than 2, recording, with a 1-bit memory, the first number of times that a storage command processing completion message corresponding to one piece of first address location information has been received.
In an alternative embodiment, releasing the cache unit occupied by the second address location information if the expected number of times the second address location information is used is equal to the value of the second number of times plus 1 includes: releasing the cache unit occupied by the second address location information if the expected number of times the second address location information is used is 1 and the value of the second number of times is 0.
In an alternative embodiment, releasing the cache unit occupied by the second address location information if the expected number of times the second address location information is used is equal to the value of the second number of times plus 1 includes: releasing the cache unit occupied by the second address location information if the expected number of times the second address location information is used is 2 and the value of the second number of times is 1.
In an alternative embodiment, not releasing the cache unit occupied by the second address location information if the expected number of times the second address location information is used is greater than the value of the second number of times plus 1 includes: not releasing the cache unit occupied by the second address location information if the expected number of times the second address location information is used is 2 and the value of the second number of times is 0.
In an alternative embodiment, in the case of releasing the cache unit occupied by the second address location information, the method further includes clearing the second number of times corresponding to the second address location information recorded in the memory.
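When every piece of address location information is used at most twice, one bit per cache unit is enough to hold the count (0 or 1). The sketch below keeps those bits in a single integer bitmap and clears the bit on release. The class name `OneBitTracker`, the bitmap layout, and the string return values are assumptions made for illustration:

```python
class OneBitTracker:
    """One recorded bit per cache unit (hypothetical illustration)."""

    def __init__(self, num_units: int):
        self.num_units = num_units
        self.bits = 0                      # bit i = count recorded for unit i

    def complete(self, unit: int, expected_uses: int) -> str:
        """Process one completion message that refers to cache unit `unit`."""
        if not 0 <= unit < self.num_units:
            raise IndexError("cache unit out of range")
        if expected_uses not in (1, 2):
            raise ValueError("1-bit tracking assumes at most 2 uses")
        seen = (self.bits >> unit) & 1     # completions recorded so far
        if expected_uses == seen + 1:
            self.bits &= ~(1 << unit)      # clear the record on release
            return "release"
        if expected_uses > seen + 1:
            self.bits |= 1 << unit         # count goes from 0 to 1
            return "keep"
        return "error"                     # expected 1 use but 1 already seen
```

A unit expected to be used twice returns "keep" on its first completion and "release" on its second, after which its bit is 0 again and the unit can be reallocated.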
In an alternative embodiment, the method further comprises determining whether there is an error in the received storage command processing completion message corresponding to the NVMe read command according to a first number of storage command processing completion messages expected to be received by the NVMe read command and a second number of received second storage command processing completion messages corresponding to the NVMe read command.
In an alternative embodiment, if the second number is greater than the first number, it is determined that there is an error in the storage command processing completion message corresponding to the received NVMe read command.
In an alternative embodiment, if the first number is equal to the second number, whether an error exists in the received storage command processing completion messages corresponding to the NVMe read command is determined according to the first logical address corresponding to the storage command processing completion message expected for each piece of first address location information and the second logical address corresponding to the received storage command processing completion message that indicates that first address location information.
In an alternative embodiment, if, for at least one piece of first address location information, the corresponding first logical address differs from the second logical address, it is determined that an error exists in the received storage command processing completion messages corresponding to the NVMe read command.
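The per-command checks above (count first, then logical addresses) can be sketched as follows. The representation is an assumption: each completion message is modeled as an `(info_id, logical_address)` pair, and `check_completions` is a hypothetical name. The "pending" result for the case where fewer messages have arrived than expected is also an assumption, since the text does not name that state:

```python
from collections import Counter


def check_completions(expected, received) -> str:
    """Compare expected and received completion messages of one read command.

    expected, received: lists of (info_id, logical_address) pairs, where
    info_id identifies a piece of address location information.
    """
    if len(received) > len(expected):
        return "error"                     # second number > first number
    if len(received) < len(expected):
        return "pending"                   # still waiting for completions
    # Counts are equal: every expected pair must have been received.
    if Counter(expected) == Counter(received):
        return "ok"
    return "error"                         # a logical address differs


expected = [(0, 100), (1, 101)]
print(check_completions(expected, [(1, 101), (0, 100)]))   # ok
print(check_completions(expected, [(0, 100), (1, 999)]))   # error
```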
In an optional embodiment, the method further includes determining that an erroneous storage command processing completion message exists if the first delay time corresponding to the NVMe read command is greater than or equal to a specified value and the cache unit occupied by the first address location information corresponding to the NVMe read command is not completely released.
In an alternative embodiment, the method further comprises recording a first timestamp of the received NVMe read command, and calculating a first difference between the current time and the first timestamp, wherein the first difference is used as a first delay time corresponding to the NVMe read command.
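The delay-based check can be sketched as a single predicate. The parameter names, the use of plain numbers for time, and the representation of "not completely released" as a released/total pair are assumptions for illustration:

```python
def read_command_timed_out(first_timestamp: float, now: float, limit: float,
                           units_released: int, units_total: int) -> bool:
    """Return True if an erroneous completion message is presumed to exist.

    first_timestamp: time at which the NVMe read command was received.
    now:             current time, in the same unit as first_timestamp.
    limit:           the specified value for the first delay time.
    units_released:  cache units of this command released so far.
    units_total:     cache units allocated to this command.
    """
    delay = now - first_timestamp              # the first difference
    fully_released = units_released == units_total
    return delay >= limit and not fully_released
```

A command whose cache units have all been released never triggers the check, no matter how long ago it was received.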
In an alternative embodiment, the method further comprises: in response to receiving a plurality of storage command processing completion messages, sequentially recording a second timestamp of each of the received storage command processing completion messages, calculating a second difference between the current time and the second timestamp of each storage command processing completion message, and determining the processing order of the plurality of storage command processing completion messages according to the second differences.
In an alternative embodiment, determining the processing order of the plurality of storage command processing completion messages according to the second differences includes processing the received storage command processing completion messages sequentially in descending order of the second difference.
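Ordering by the largest difference first means that the message that has waited longest is processed first. A minimal sketch, where the `Completion` record and the `processing_order` function are hypothetical names and timestamps are plain numbers:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Completion:
    tag: int            # identifier of the completion message (illustrative)
    timestamp: float    # second timestamp, recorded on receipt


def processing_order(messages, now: float):
    """Sort completion messages by (now - timestamp), largest first."""
    return sorted(messages, key=lambda m: now - m.timestamp, reverse=True)


msgs = [Completion(0, 5.0), Completion(1, 2.0), Completion(2, 9.0)]
print([m.tag for m in processing_order(msgs, now=10.0)])   # [1, 0, 2]
```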
In an alternative embodiment, the method further comprises generating a DMA command according to the first storage command processing completion message in response to processing the first storage command processing completion message, and moving data from the storage device cache to the host according to the DMA command.
In an alternative embodiment, the method further comprises storing information of a second storage command corresponding to the second storage command processing completion message into metadata of the storage device in response to a second difference value corresponding to the second storage command processing completion message being greater than a preset threshold.
In an alternative embodiment, the method further comprises, in response to receiving a custom command sent by the host that satisfies the NVMe protocol, reading information of a second storage command stored in metadata of the storage device according to a logical address indicated by the custom command.
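The two embodiments above (saving information about slow storage commands and reading it back later) can be sketched with a dictionary standing in for the device metadata. Keying the saved information by logical address, and the function names `record_if_slow` and `read_slow_command`, are assumptions made for this example; the text only states that the custom command indicates a logical address:

```python
def record_if_slow(metadata: dict, lba: int, command_info: dict,
                   second_difference: float, threshold: float) -> bool:
    """Save command info when its completion waited longer than threshold."""
    if second_difference > threshold:
        metadata[lba] = command_info
        return True
    return False


def read_slow_command(metadata: dict, lba: int):
    """Return the saved info for the given logical address, or None."""
    return metadata.get(lba)


metadata = {}
record_if_slow(metadata, 100, {"tag": 7}, second_difference=12, threshold=10)
print(read_slow_command(metadata, 100))   # {'tag': 7}
```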
According to a second aspect of the present application, there is provided a storage device comprising a host command processing unit, the host command processing unit comprising a cache unit, a circuit for writing the cache unit, and a circuit for releasing the cache unit, the circuit for releasing the cache unit comprising a first memory, wherein:
the circuit for writing the cache unit is configured to, in response to receiving an NVMe read command, add each piece of first address location information indicated by the NVMe read command to an allocated cache unit;
the first memory is configured to record the first number of times that a storage command processing completion message corresponding to each piece of first address location information has been received; and
the circuit for releasing the cache unit is configured to determine whether to release the cache unit occupied by second address location information and determine whether an erroneous storage command processing completion message exists, according to the expected number of times the second address location information indicated by a storage command processing completion message is used and the recorded second number of times that a storage command processing completion message corresponding to the second address location information has been received.
In an alternative embodiment, the circuit for releasing the cache unit is configured to: release the cache unit occupied by the second address location information if the expected number of times the second address location information is used is equal to the value of the first memory corresponding to the second address location information plus 1; not release the cache unit occupied by the second address location information if the expected number of times is greater than the value of the first memory corresponding to the second address location information plus 1; and determine that an erroneous storage command processing completion message exists if the expected number of times is less than the value of the first memory corresponding to the second address location information plus 1.
In an alternative embodiment, the first memory is a 1-bit memory in case that the expected number of times the respective first address location information indicated by the NVMe read command is used is not more than 2.
In an alternative embodiment, the circuit for releasing the cache unit is further configured to release the cache unit occupied by the second address location information if the expected number of times the second address location information is used is 1 and the value of the second number of times is 0.
In an alternative embodiment, the circuit for releasing the cache unit is further configured to release the cache unit occupied by the second address location information if the expected number of times the second address location information is used is 2 and the value of the second number of times is 1.
In an alternative embodiment, the circuit for releasing the cache unit is further configured to not release the cache unit occupied by the second address location information if the expected number of times the second address location information is used is 2 and the value of the second number of times is 0.
In an alternative embodiment, in the case of releasing the cache unit occupied by the second address location information, the host command processing unit clears the second number of times corresponding to the second address location information recorded in the memory.
In an alternative embodiment, the host command processing unit is further configured to determine whether an error exists in the storage command processing completion message corresponding to the received NVMe read command according to a first number of storage command processing completion messages expected to be received by the NVMe read command and a second number of received second storage command processing completion messages corresponding to the NVMe read command.
In an alternative embodiment, the host command processing unit further includes a second memory to record a first number of storage command processing completion messages that the NVMe read command is expected to receive, and a third memory to record a second number of second storage command processing completion messages that have been received that correspond to the NVMe read command.
In an alternative embodiment, the host command processing unit is further configured to determine that an error exists in the storage command processing completion message corresponding to the received NVMe read command if the second number is greater than the first number.
In an alternative embodiment, the host command processing unit is further configured to, if the first number is equal to the second number, determine whether an error exists in the received storage command processing completion messages corresponding to the NVMe read command according to the first logical address corresponding to the storage command processing completion message expected for each piece of first address location information and the second logical address corresponding to the received storage command processing completion message that indicates that first address location information.
In an optional embodiment, the host command processing unit is further configured to determine that an error exists in the storage command processing completion message corresponding to the received NVMe read command if at least one first logical address corresponding to the first address location information is different from the second logical address.
In an optional embodiment, the host command processing unit is further configured to determine that an erroneous storage command processing completion message exists if the first delay time corresponding to the NVMe read command is greater than or equal to a specified value, and the cache unit occupied by the first address location information corresponding to the NVMe read command is not completely released.
In an alternative embodiment, the host command processing unit further includes a read initiating circuit, the read initiating circuit includes an information buffer, the information buffer records a first timestamp of the received NVMe read command, and the host command processing unit is further configured to calculate a first difference between a current time and the first timestamp, where the first difference is used as a first delay time corresponding to the NVMe read command.
In an alternative embodiment, the host command processing unit is further configured to, in response to receiving a plurality of storage command processing completion messages, sequentially record a second timestamp of each of the received storage command processing completion messages, calculate a second difference between the current time and the second timestamp of each storage command processing completion message, and determine the processing order of the plurality of storage command processing completion messages according to the second differences.
In an alternative embodiment, the host command processing unit processes the received storage command processing completion messages sequentially in descending order of the second difference.
In an alternative embodiment, the host command processing unit is further configured to, in response to processing the first storage command processing completion message, generate a DMA command according to the first storage command processing completion message, and move data from the storage device cache to the host according to the DMA command.
In an alternative embodiment, the host command processing unit is further configured to save information of a second storage command corresponding to the second storage command processing completion message to metadata of the storage device in response to a second difference value corresponding to the second storage command processing completion message being greater than a preset threshold.
In an alternative embodiment, the host command processing unit is further configured to, in response to receiving a custom command sent by the host that satisfies the NVMe protocol, read information of a second storage command stored in metadata of the storage device according to a logical address indicated by the custom command.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings may be obtained according to these drawings for a person having ordinary skill in the art.
FIG. 1A illustrates a block diagram of a solid state storage device;
FIG. 1B illustrates a block diagram of control components in a solid state storage device;
FIG. 2 is a schematic diagram of a host command processing unit according to an embodiment of the application;
FIG. 3 is a schematic diagram of a host command processing unit according to another embodiment of the present application;
FIG. 4 is a schematic diagram illustrating the organization of data of a memory device on physical pages of an NVM chip according to one embodiment of the present application;
FIG. 5 shows a schematic diagram of physical pages of a memory device according to another embodiment of the application;
FIG. 6 is a schematic diagram illustrating the movement of data from a storage device to a host in accordance with an embodiment of the present application;
FIG. 7A is a diagram showing data to be moved indicated by an NVMe read command according to an embodiment of the present application;
FIG. 7B is a diagram illustrating the host memory address indicated by the NVMe read command according to an embodiment of the present application;
FIG. 7C is a schematic diagram showing data indicated by an NVMe read command stored in a host memory according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a circuit for releasing a cache unit according to an embodiment of the application;
FIG. 9 is a schematic diagram of a circuit for releasing a buffer unit according to another embodiment of the present application;
FIG. 10A is a diagram illustrating a host memory address indicated by an NVMe read command according to an embodiment of the present application;
FIG. 10B is a schematic diagram showing data indicated by an NVMe read command being stored in a host memory according to an embodiment of the present application;
FIG. 11 is a flowchart illustrating a method for managing cache units of a storage device according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a circuit for releasing a buffer unit according to an embodiment of the present application;
FIG. 13 is a schematic diagram of another circuit for releasing a buffer unit according to an embodiment of the present application;
FIG. 14 is a flowchart of a cache unit management method of a storage device according to another embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
For ease of understanding, a brief description of how the host and the storage device move data through PRP entries or SGL descriptors follows.
If data is moved from the storage device to the host, the host sends an NVMe read command to the storage device. The NVMe read command indicates the address in host memory to which the data is to be written, and also contains the starting logical address (LBA) of the storage device to be accessed as well as the data length. The storage device looks up the FTL table according to the LBA, finds the corresponding physical address, and obtains the data from the physical block corresponding to that physical address. The host memory address to which the data is to be written may be indicated by a PRP entry or an SGL descriptor. The NVMe read command indicates one or more PRP entries or SGL descriptors.
If data is moved from the host to the storage device, the host sends an NVMe write command to the storage device. The NVMe write command indicates the length of the data to be written and the location in host memory where the data to be written is stored, which may be indicated by a PRP entry or an SGL descriptor. The NVMe write command may indicate one or more PRP entries or SGL descriptors. The storage locations of the data to be written in the storage device are allocated by the storage device.
The host carries a PRP or SGL related field in the NVMe read command or NVMe write command. This field may be the SGL or PRP itself, pointing to the host memory address space to be accessed; it may be a pointer to an SGL or PRP linked list; or it may even be a pointer to such a pointer. Whether it is a PRP entry or an SGL descriptor, it essentially describes one or more address spaces in host memory, whose locations in host memory are arbitrary. A PRP entry indicates a storage space of fixed size in the host memory, i.e., the storage spaces indicated by different PRP entries have the same size, e.g., 4KB, 16KB, etc. An SGL descriptor indicates a storage space of variable size in the host memory. For example, the NVMe read command indicates three SGL descriptors (SGL1, SGL2, and SGL3), where SGL1 indicates 4KB of storage space, SGL2 indicates 2KB of storage space, and SGL3 indicates 3KB of storage space.
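The difference between fixed-size PRP entries and variable-size SGL descriptors can be illustrated with a short sketch. The `layout` helper is hypothetical; it only accumulates the region lengths given in the example above (4KB, 2KB, and 3KB) to show where each region's data would start within the overall transfer.

```python
KB = 1024


def layout(region_lengths):
    """Return (offset, length) of each region within the transfer, plus the total length."""
    regions = []
    offset = 0
    for length in region_lengths:
        regions.append((offset, length))
        offset += length
    return regions, offset


# Three SGL descriptors of different sizes, as in the example above.
sgl_regions, sgl_total = layout([4 * KB, 2 * KB, 3 * KB])

# Three PRP entries: every entry describes a region of the same size (4KB assumed here).
prp_regions, prp_total = layout([4 * KB] * 3)
```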
In either form, the storage device may always retrieve the corresponding SGL descriptor or PRP entry based on the NVMe read command or the NVMe write command. It will be appreciated that embodiments of the present application are not limited to transferring data between a host and a storage device via PRP entries or SGL descriptors, but may also transfer data via other similarly functioning information, which in some cases is referred to collectively as address location information indicated by an IO command.
In response to receiving an NVMe read command or an NVMe write command (collectively referred to as an IO command), the storage device first obtains a PRP or SGL according to the IO command before accessing the data. For example, all PRP entries or SGL descriptors corresponding to the IO command are acquired. Since there may be a plurality of PRP entries or SGL descriptors corresponding to the IO command, the movement of data (movement between the host and the storage device) described for each PRP entry or SGL descriptor is performed independently and in parallel. The movement of the data described by the PRP entries or SGL descriptors is thus typically not done at the same time. For this purpose, the control unit of the storage device provides a cache, in particular a cache unit located inside the control unit, to temporarily record the PRP entry or SGL descriptor corresponding to the IO command. However, the cache unit inside the control unit is extremely limited and precious, and the PRP entries or SGL descriptors corresponding to the IO commands in the cache unit need to be released as soon as possible to provide the buffering capability for the PRP entries or SGL descriptors of other IO commands.
The process by which the storage device processes an NVMe write command is described below, taking as an example an NVMe write command that uses PRP to describe the host memory address. FIG. 2 is a schematic diagram of a host command processing unit according to an embodiment of the application. As shown in fig. 2, the host command processing unit includes a PRP circuit, a write initiation circuit, a DMA transfer circuit, and a cache unit. The write initiation circuit comprises a DMA command generating circuit and a storage command generating circuit. Fig. 2 schematically illustrates one example of a write initiation circuit; for other forms of the write initiation circuit, reference may be made to Chinese patent 202310788448.3, which is not repeated here.
The process of the host command processing unit processing the NVMe write command includes:
The PRP circuit, in response to receiving the NVMe write command (process (1) in fig. 2), obtains the PRP entry from the NVMe write command, caches the PRP entry in the cache unit, and provides the DMA command generating circuit (process (2) in fig. 2) with the PRP index indicating the PRP entry.
The PRP index has various forms, for example, the PRP index carries a host memory address indicated in the PRP entry, or the PRP index carries a pointer of a cache unit, the cache unit stores the PRP entry, and the PRP entry is obtained from the cache unit through the PRP index, so as to obtain the host memory address. Still alternatively, the PRP index corresponds one-to-one to the PRP entry.
The DMA command generation circuit generates a DMA command based on the PRP index in response to receiving the PRP index, and supplies the generated DMA command to the DMA transfer circuit (as in process (3) in fig. 2).
A DMA command is a command for controlling a DMA transfer circuit to perform data transfer, which indicates a mapping relationship of a host memory address space and a storage device cache address space (e.g., DRAM address) for performing an operation of data movement between a host and a storage device. The host memory address indicated by the DMA command is determined from the address space indicated by the PRP and the storage device cache address is allocated by the storage device.
For NVMe write commands, the host memory address is the source address and the storage device cache address is the destination address. Generating a DMA command requires a host memory address as a source address and a storage device cache address as a destination address. The DMA command generating circuit obtains the host memory address according to the PRP index, and allocates a storage space from the storage device cache to obtain the storage device cache address.
The DMA transfer circuit moves data between the host memory and the storage device buffer according to the DMA command in response to the received DMA command, and after the data movement is completed, provides a message indicating that the DMA command processing is completed to the storage command generating circuit (as in process (4) in fig. 2).
The storage command generating circuit generates a storage command in response to receiving the DMA command processing completion message sent from the DMA transfer circuit, and sends the storage command to the storage command processing unit (as in process (5) in fig. 2). By way of example, a storage command causes a storage command processing unit to move data in a storage device cache to an NVM chip of a storage device in accordance with the storage command. The storage command processing unit and the storage command are both in the prior art, and are not described herein.
As can be seen from the above-described processes (1) - (5) shown in fig. 2, the PRP entries indicated by the NVMe write command are mainly used to generate the DMA command, and once the DMA command is generated based on a certain PRP entry indicated by the NVMe write command, the PRP entry in the cache unit can be released.
Therefore, the host command processing unit of the storage device receives the NVMe write command, allocates a buffer unit for the address location information indicated by the NVMe write command, generates the DMA command according to the address location information indicated by the NVMe write command, and releases the buffer unit occupied by the address location information indicated by the NVMe write command corresponding to the DMA command in response to generating the DMA command. In the write initiator circuit shown in fig. 2, the function of releasing the buffer unit may be implemented by the DMA command generating circuit, that is, the DMA command generating circuit releases the buffer unit occupied by the address location information indicated by the NVMe write command corresponding to the DMA command in response to generating the DMA command.
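The write-path behavior described above (cache the PRP entry, generate the DMA command, then release the cache unit immediately) can be sketched as follows. The `CacheUnits` class, the dictionary form of the DMA command, and the example addresses are hypothetical simplifications, not the circuit of fig. 2.

```python
class CacheUnits:
    """Toy model of the limited cache units inside the control component."""

    def __init__(self, count):
        self.free = list(range(count))
        self.entries = {}

    def allocate(self, prp_entry):
        """Store a PRP entry (here, just a host memory address) and return its index."""
        index = self.free.pop()
        self.entries[index] = prp_entry
        return index

    def release(self, index):
        del self.entries[index]
        self.free.append(index)


def generate_write_dma_command(cache, prp_index, device_cache_address):
    """Build a DMA command for a write, then release the PRP entry's cache unit."""
    host_address = cache.entries[prp_index]          # source: host memory
    command = {"src": host_address, "dst": device_cache_address}
    cache.release(prp_index)                         # no longer needed once the command exists
    return command


cache = CacheUnits(count=2)
index = cache.allocate(0x1000)
command = generate_write_dma_command(cache, index, device_cache_address=0x8000)
```

Because the PRP entry is consumed exactly once on the write path, no counting is needed before the cache unit is released.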
The process by which the storage device processes an NVMe read command is described below, taking as an example an NVMe read command that uses PRP to describe the host memory address. FIG. 3 is a schematic diagram of a host command processing unit according to another embodiment of the application. As shown in fig. 3, the host command processing unit includes a PRP circuit, a read initiation circuit, a DMA transfer circuit, and a cache unit, wherein the read initiation circuit includes an information cache, a DMA command generation circuit, and a completion message processing circuit. Fig. 3 schematically illustrates one example of a read initiation circuit; for other forms of the read initiation circuit, reference may be made to Chinese patent 202310788454.9, which is not repeated here.
The PRP circuitry is to, in response to receiving the NVMe read command, obtain a PRP entry from the NVMe read command, cache the PRP entry to the cache unit, and provide a PRP index indicating the PRP entry to the read initiate circuitry. The acquisition of PRP entries from NVMe read commands and the acquisition of PRP indexes belongs to the prior art (e.g., PRP circuit shown in chinese patent 202110746144.1), and is not described in detail herein.
The read initiation circuit updates the information cache with the PRP index in response to receiving the PRP index or the NVMe read command (as in process (1) of fig. 3). The information cache may accommodate a plurality of entries. Each entry corresponds to one NVMe read command, and different entries correspond to different NVMe read commands; all PRP entries, all DMA commands, and all DMA command processing completion messages associated with the same NVMe read command correspond to the same entry. The storage device can thereby process multiple NVMe read commands concurrently. It will be appreciated that the NVMe read commands stored in the information cache may differ in form from the NVMe read commands received from the host, but each NVMe read command occupies one entry in the information cache. For example, to meet the processing needs of the read initiation circuit, the NVMe read command in the information cache includes information such as a command identifier, a cache unit index, a DMA command count, and a time stamp. The command identifier is used to identify a specified NVMe read command from among a plurality of NVMe read commands. The cache unit index is used to indicate PRP entries in the cache unit. The DMA command count is used to indicate the number of DMA commands, corresponding to the NVMe read command to which it belongs, that have completed processing. The time stamp indicates the time at which the NVMe read command was received.
It will be appreciated that the control unit of the storage device also performs other processing on the NVMe read command, for example, generating a physical address based on the logical address to be accessed by the NVMe read command through the FTL table, reading data from the NVM chip with the physical address, etc., these processing procedures and the unit for implementing these processing are well known in the art. The components that implement these processes are collectively referred to as a back-end module. After the data is read from the NVM chip and added to the memory device cache (e.g., DRAM), the back-end module provides a store command processing complete message to the read initiate circuit (e.g., process (2) in fig. 3). The single NVMe read command corresponds to one or more storage commands, such that the single NVMe read command corresponds to one or more storage command complete messages.
The DMA command generation circuit obtains a buffer unit index corresponding to the storage command from the information buffer (process (3) in fig. 3) in response to receiving the storage command processing completion message (process (2) in fig. 3), obtains a PRP entry corresponding to the storage command from the buffer unit based on the buffer unit index (process (4) in fig. 3), and generates a corresponding DMA command based on the PRP entry, and sends the DMA command to the DMA transfer circuit (process (5) in fig. 3).
To generate a DMA command, it is necessary to know the storage device cache address, which serves as the source address of the DMA transfer, and the host memory address, which serves as the destination address. According to the embodiment of the application, the storage device cache address is obtained from the storage command processing completion message, and the host memory address is obtained from the PRP entry. For example, the NVMe read command stored in the information cache of fig. 3 includes a cache unit index: the NVMe read command corresponding to the storage command processing completion message is identified, its cache unit index is obtained from the information cache, and the PRP entry is obtained from the corresponding cache unit based on the cache unit index.
The DMA transfer circuit moves data from the storage device cache to the host memory in response to receiving the DMA command, and sends a DMA command processing completion message to the completion message processing circuit in response to the DMA command processing being completed (process (6) in fig. 3).
The completion message processing circuit is coupled with the DMA transfer circuit and the cache unit, and is used for updating, in response to receiving a DMA command processing completion message, the NVMe read command in the information cache that corresponds to the DMA command processing completion message. For example, the DMA command count value is updated (e.g., incremented), or, if all DMA commands corresponding to the NVMe read command have been processed, the NVMe read command is deleted from the information cache (as in process (7) of fig. 3).
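The bookkeeping performed on each DMA command processing completion message can be sketched as below. The entry fields follow the information listed for the information cache (command identifier, cache unit index, DMA command count, time stamp); the `expected_dma_commands` field and the dictionary used as the information cache are assumptions added for illustration, since the text does not state how the total number of DMA commands is tracked.

```python
from dataclasses import dataclass


@dataclass
class ReadCommandEntry:
    command_id: int              # identifies the NVMe read command
    cache_unit_indexes: list     # indexes of the cache units holding its PRP entries
    timestamp: float             # time at which the NVMe read command was received
    expected_dma_commands: int   # assumption: total number of DMA commands for this command
    dma_command_count: int = 0   # number of DMA commands that have completed processing


def on_dma_complete(info_cache, command_id):
    """Update the entry for one completed DMA command; delete it when all are done."""
    entry = info_cache[command_id]
    entry.dma_command_count += 1
    if entry.dma_command_count == entry.expected_dma_commands:
        del info_cache[command_id]
        return True   # the NVMe read command is fully processed
    return False


info_cache = {7: ReadCommandEntry(7, [0, 1], timestamp=0.0, expected_dma_commands=2)}
first = on_dma_complete(info_cache, 7)
count_after_first = info_cache[7].dma_command_count
second = on_dma_complete(info_cache, 7)
```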
In the embodiment shown in fig. 3, the information cache of the read initiation circuit is updated in two different stages of NVMe read command processing. In stage (1), after the control component of the storage device receives the NVMe read command and obtains the PRP entry from the host according to the NVMe read command, the control component adds an entry associated with the NVMe read command to the information cache; this entry includes a PRP index indicating the cache address of the PRP entry. In stage (2), in response to receiving one or more DMA command processing completion messages corresponding to the NVMe read command, the control component updates the DMA command count of the NVMe read command in the information cache or deletes the NVMe read command.
FIG. 4 is a schematic diagram illustrating the organization of data of a memory device on physical pages of an NVM chip according to one embodiment of the present application. As shown in fig. 4, in an NVM chip, the memory space storing data is organized into physical blocks and physical pages. In the IO command provided to the storage device by the host, the size of the storage unit represented by the logical address LBA is, for example, 4KB, and in addition to the 4KB data provided by the user (recorded in the user data area shown in fig. 4), information generated by the storage device itself, for example, check data generated by the ECC algorithm on the user data, the LBA of the user data, a random number seed when scrambling the user data, an encryption or decryption key, and the like are recorded in the spare data area. For the sake of simplicity, the data recorded in the spare data area is also referred to as metadata. Alternatively, the size of the storage unit represented by each LBA may be 4KB, 8KB or 16KB, or the like, and the metadata size may be 512Byte or other sizes.
In the embodiment shown in FIG. 4, the physical page is slightly larger than the storage unit represented by the LBA, i.e., the physical page is slightly larger than 4KB in size. In an alternative embodiment, the storage device also provides a larger physical page (e.g., greater than 16KB), so that the capacity of the physical page is inconsistent with the capacity of the storage unit represented by the LBA. As shown in fig. 5, user data corresponding to 4 LBAs is recorded in the physical page, and metadata corresponding to each piece of user data is also recorded. The combination of user data stored in a physical page and its corresponding metadata is referred to as a DU (Data Unit). As shown in fig. 5, 4 DUs are recorded in a physical page. When the host reads data from the storage device, the data is moved in units of DUs.
In some alternative embodiments, the size of the memory block indicated by each PRP entry (in host memory) is consistent with the size of the memory location represented by the LBA, e.g., the size of the memory block indicated by each PRP is 4KB and the size of the memory location represented by the LBA is also 4KB. In alternative embodiments, the size of the memory block (host memory) indicated by the PRP entry is not consistent with the size of the memory location represented by the LBA, e.g., the memory size indicated by the PRP entry is 4KB and the size of the memory location represented by the LBA is 2KB.
Take as an example the case in which the size of the memory block indicated by each PRP entry (in the host memory) is identical to the size of the storage unit represented by the LBA, for example, both are 4KB. As shown in fig. 6, the storage unit size represented by each logical address LBA indicated by the NVMe read command is, for example, 4KB, the metadata size is, for example, 512Byte, and the size of the data moved from the NVM chip to the DRAM by each storage command is, for example, (4KB+512Byte). Each piece of data of size (4KB+512Byte) stored in the DRAM is called a DTU (Data Transfer Unit). The DTU is moved by DMA commands to the memory block described by the PRP entry on the host side. However, the size of the memory block (in host memory) indicated by each PRP entry is, for example, 4KB, so that the data size of a DTU (4KB+512Byte) is larger than the size (4KB) of the memory block indicated by each PRP entry on the host side. Further, when the logical address LBA corresponding to the DTU is not aligned with 4KB, part of the data in the DTU is not valid data and does not need to be transferred to the host, so that the data moved to the host according to some DTUs (even with metadata added) is smaller than the memory block size (e.g., 4KB) indicated by the PRP entry. Still further, in addition to the metadata recorded in the spare data area of the NVM chip, the control component may also generate other metadata (e.g., PI, protection information), so that when the DTU is moved to host memory, the required host memory space is not equal to the memory block size (e.g., 4KB) indicated by the PRP entry.
But at the host side the data is stored in a 4KB aligned manner (each PRP indicates a memory block size of 4KB and its address space is aligned along 4 KB), so that at the host side a memory block indicated by one PRP entry may have data for one or more DTUs.
Since the size of the memory block of the host indicated by the PRP entry is inconsistent with the DTU size, the storage command processing completion message is not in one-to-one correspondence with the PRP entry, and thus when the storage command processing completion message is received during the NVMe read command processing, it needs to be determined whether to delete the corresponding PRP entry.
As shown in fig. 7A, 7B, and 7C, in fig. 7A, six pieces of data (data 1 to data 6) are read from the NVM chip according to the physical addresses corresponding to the logical addresses LBA0 to LBA5 indicated by the NVMe read command, and are stored in the DRAM. The storage unit size represented by each logical address is 4KB. Data 1 is the data (4KB+512Byte) corresponding to LBA0; since the LBA0 indicated by the NVMe read command is not aligned with 4KB, the beginning part of data 1 does not need to be moved to the host, and the part to be moved to the host is smaller than 4KB. Data 2, data 3, data 4, and data 5 are the data (4KB+512Byte) corresponding to LBA1, LBA2, LBA3, and LBA4, respectively, and the size of each to be moved to the host is equal to 4KB+512Byte. Data 6 is the data (4KB+512Byte) corresponding to LBA5; since the end of the range indicated by the NVMe read command is not aligned with 4KB, the end part of data 6 does not need to be moved to the host.
As shown in fig. 7B, PRP0-PRP6 are the PRP entries indicated by the NVMe read command, and the size of the memory block in the host memory represented by each of PRP0-PRP6 is 4KB. Although the memory blocks indicated by PRP0-PRP6 are illustrated as spatially contiguous in fig. 7B, these memory blocks may alternatively be non-contiguous in the host memory space. Taken in the order PRP0-PRP6, however, these memory blocks together form a contiguous virtual memory space. The respective pieces of data (DTUs) to be moved, according to the NVMe read command, to the memory blocks indicated by the PRP entries are ordered by their LBAs in this contiguous virtual memory space, whereby each DTU has a determined location and size in the contiguous virtual memory space. The location of a DTU in the virtual memory space is thus determined from its LBA and size, and the memory block, and hence the PRP entry, to which that location belongs is obtained from the location.
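The lookup from a DTU's position in the contiguous virtual memory space to the PRP entries it touches can be sketched as follows. The function name and the example offsets are hypothetical and do not reproduce the exact geometry of fig. 7C; the sketch only assumes 4KB memory blocks numbered in PRP order.

```python
PRP_BLOCK_SIZE = 4096  # each PRP entry indicates a 4KB memory block


def prp_indexes_for(offset, length):
    """Return the indexes of the PRP entries covered by the byte range [offset, offset + length)."""
    first = offset // PRP_BLOCK_SIZE
    last = (offset + length - 1) // PRP_BLOCK_SIZE
    return list(range(first, last + 1))


# Hypothetical example: a 1KB piece at the start of the space, followed by a full
# 4KB+512Byte piece that begins inside block 0 and spills over into block 1.
small_piece = prp_indexes_for(offset=0, length=1024)
spilling_piece = prp_indexes_for(offset=3072, length=4096 + 512)
```

A piece of data that crosses a 4KB boundary therefore needs more than one PRP entry, and one PRP entry may be needed by more than one piece of data.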
As shown in fig. 7C, data is stored starting from the memory block indicated by PRP0, that is, data 1 is stored in the memory block indicated by PRP0. Since the size of data 1 is smaller than 4KB and the start address of the requested data therein is not aligned with 4KB, data 1 is stored in the memory block indicated by PRP0 not from the beginning but from an intermediate location, and after data 1 is stored, the memory block indicated by PRP0 also stores part of data 2. Part of data 2 is stored in the memory block indicated by PRP1; because the size of data 2 is greater than 4KB, the memory block indicated by PRP1 cannot completely store data 2, and the memory block indicated by PRP2 also needs to be occupied. Data 3 is stored in the memory blocks indicated by PRP2 and PRP3. Data 4 is stored in the memory blocks indicated by PRP3 and PRP4. Data 5 is stored in the memory blocks indicated by PRP5 and PRP6. Data 6 is stored in the memory block indicated by PRP6. In fig. 7C, each piece of data corresponding to an LBA to be stored to the host memory is indicated by a hatched box.
Take PRP0 as an example. Since both data 1 and data 2 are stored in the memory block indicated by PRP0, the PRP0 entry in the cache unit cannot yet be deleted after the storage command processing completion message corresponding to data 1 is received (because PRP0 is still needed when the storage command processing completion message corresponding to data 2 is subsequently processed); the PRP0 entry in the cache unit is deleted only after the storage command processing completion message corresponding to data 2 is also received. Further, the order in which the storage command processing completion messages corresponding to data 1 and data 2 are received is not guaranteed, so it cannot be assumed that the message corresponding to data 1 has already been received when the message corresponding to data 2 is received. PRP entries in the cache unit therefore cannot be deleted according to the number or index of the data corresponding to a storage command processing completion message. Instead, a PRP entry in the cache unit is deleted according to the number of storage command processing completion messages received for it, and the storage command processing completion messages corresponding to one PRP entry may differ from one another. For example, the PRP0 entry can be deleted from the cache unit only after storage command processing completion messages associated with PRP0 have been received twice. For PRP1, since only data 2 is stored in the memory block indicated by PRP1, the PRP1 entry can be deleted from the cache unit after the storage command processing completion message associated with PRP1 has been received once. The release of PRP2-PRP6 follows in the same manner.
As can be seen in conjunction with fig. 7A, 7B and 7C, in the case where each PRP entry represents a 4KB host memory block and each storage command processing completion message represents 4KB+512Byte of data, the number of DTUs whose data each PRP entry contains can be determined: <PRP0,2> (indicating that PRP0 contains data of 2 DTUs), <PRP1,1>, <PRP2,2>, <PRP3,2>, <PRP4,1>, <PRP5,1>, <PRP6,2>. According to the number of DTUs corresponding to each PRP entry, it is determined how many storage command processing completion messages must be received before the corresponding PRP entry is deleted and its cache unit is released.
By way of example, a register is provided in the control component for each PRP entry, and the number of times a storage command processing completion message indicating the PRP entry has been received is recorded in the respective register, so that the control component releases the cache unit according to the number of storage command processing completion messages received for each PRP entry. The number of storage command processing completion messages that must be received before a PRP entry is released depends on the number of DTUs corresponding to the PRP entry, which is, for example, 1, 2, or 3. When the number of DTUs corresponding to the PRP entry is 1, the cache unit storing the PRP entry can be released when one storage command processing completion message corresponding to the PRP entry has been received; when the number is 2, the cache unit can be released when two such messages have been received; and when the number is 3, the cache unit can be released when three such messages have been received. In order to record the number of storage command processing completion messages corresponding to a PRP entry, the register corresponding to each PRP entry contains at least 1 bit.
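The per-entry counts listed above can be reproduced with a short sketch. The placement table below transcribes the description of fig. 7C (which memory blocks each piece of data occupies); the variable names and the use of a counter are illustrative assumptions.

```python
from collections import Counter

# PRP indexes occupied by each piece of data, as described for fig. 7C.
placement = {
    "data 1": [0],
    "data 2": [0, 1, 2],
    "data 3": [2, 3],
    "data 4": [3, 4],
    "data 5": [5, 6],
    "data 6": [6],
}

# Expected number of storage command processing completion messages per PRP entry:
# one for every piece of data (DTU) that places data in that entry's memory block.
expected_counts = Counter()
for prp_indexes in placement.values():
    expected_counts.update(prp_indexes)

expected = dict(sorted(expected_counts.items()))
```

The result matches the pairs <PRP0,2>, <PRP1,1>, <PRP2,2>, <PRP3,2>, <PRP4,1>, <PRP5,1>, <PRP6,2>.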
As can be seen from fig. 7A, 7B, and 7C, in the case where the size of the memory location represented by each logical address is 4KB and the size of the host memory location represented by each PRP entry is also 4KB, the host memory location represented by the PRP entry accommodates at most 2 DTUs of data, and can receive a storage command processing completion message indicating the PRP entry at most twice. As another example, the register corresponding to each PRP entry includes 1 bit, and the number of times the PRP entry indicated by the command processing completion message has been used may be counted according to the value recorded in the 1-bit register. For example, a 1-bit register value of 0 indicates that no store command processing complete message has been received. The 1-bit register has a value of 1, indicating that a store command processing complete message has been received once.
Taking the PRP0 entry shown in fig. 7C as an example, the PRP0 entry accommodates data of 2 DTUs, so the expected number of times the PRP0 entry is used is 2. A 1-bit register A1 records the number of times the PRP0 entry has been used, that is, the number of received storage command processing completion messages indicating the PRP0 entry. When a storage command processing completion message indicating the PRP0 entry is received (denoted CPL1), the value of the register A1 is checked. If the value of the register A1 is 0, CPL1 is the first storage command processing completion message received for the PRP0 entry. At this time the number of times the PRP0 entry has been used is 1, which is not equal to the expected number of times, so the value of the register A1 is updated to 1, for example by inverting it from 0 to 1. When a storage command processing completion message indicating the PRP0 entry is received again (denoted CPL2), the value of the register A1 is checked again. If the value of the register A1 is 1, CPL2 is the second storage command processing completion message received for the PRP0 entry, the number of times the PRP0 entry has been used now equals the expected number of times, the PRP0 entry is deleted from the cache unit, and the cache unit occupied by the PRP0 entry is released.
Optionally, when the cache unit occupied by the PRP0 entry is released, the value of the register A1 may also be cleared, so that the register A1 can be used to record the number of storage command processing completion messages received for a subsequent PRP0 entry.
Taking the PRP1 entry shown in fig. 7C as an example, the PRP1 entry accommodates data of 1 DTU, so the expected number of times the PRP1 entry is used is 1, and a 1-bit register A2 records the number of received storage command processing completion messages indicating the PRP1 entry. When a storage command processing completion message indicating the PRP1 entry is received (denoted CPL3), the value of the register A2 is checked. If the value of the register A2 is 0, CPL3 is the first storage command processing completion message received for the PRP1 entry; the number of times the PRP1 entry has been used is 1, which equals the expected number of times, so the PRP1 entry is deleted from the cache unit and the cache unit occupied by the PRP1 entry is released. If the value of the register A2 is 1, CPL3 is the second storage command processing completion message received for the PRP1 entry; the number of times the PRP1 entry has been used is 2, which is greater than the expected number of times, so the storage command processing completion messages fed back by the storage command processing unit contain an error, and an error is reported.
As a further example, when the expected number of times a PRP entry indicated by the NVMe read command is used is not greater than 2, for example when the storage unit represented by each logical address and the host memory unit represented by each PRP entry are both 4KB in size, a 1-bit register is used to record the number of times the PRP entry has been used. When a storage command processing completion message corresponding to a PRP entry is received, whether to release the cache unit occupied by the PRP entry is determined from the value of the corresponding register and the expected number of times the PRP entry is used, as follows:
(1) If the expected number of times the PRP entry is used is 1 and the value of the register is 0, the cache unit occupied by the PRP entry is released and the register is cleared;
(2) If the expected number of times the PRP entry is used is 1 and the value of the register is 1, an error is reported;
(3) If the expected number of times the PRP entry is used is 2 and the value of the register is 0, the value of the register is updated to 1 and the cache unit occupied by the PRP entry is not released;
(4) If the expected number of times the PRP entry is used is 2 and the value of the register is 1, the cache unit occupied by the PRP entry is released and the register is cleared.
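The four cases above can be sketched as a small decision function. This is only an illustrative software model, not the circuit of the embodiment; the names `Action` and `on_completion` are hypothetical and do not appear in the application.

```python
from enum import Enum
from typing import Tuple


class Action(Enum):
    RELEASE = "release"  # free the cache unit and clear the register
    HOLD = "hold"        # keep the cache unit and set the register to 1
    ERROR = "error"      # more completion messages than expected


def on_completion(expected: int, register: int) -> Tuple[Action, int]:
    """Model one completion message for a PRP entry tracked by a 1-bit register.

    expected: expected number of times the PRP entry is used (1 or 2).
    register: current value of the 1-bit register (0 or 1).
    Returns the action to take and the new register value.
    """
    if expected not in (1, 2) or register not in (0, 1):
        raise ValueError("a 1-bit register only covers expected counts of 1 or 2")
    if expected == 1:
        if register == 0:
            return Action.RELEASE, 0   # case (1)
        return Action.ERROR, register  # case (2)
    if register == 0:
        return Action.HOLD, 1          # case (3)
    return Action.RELEASE, 0           # case (4)
```

For instance, a PRP entry that holds data of 2 DTUs goes through case (3) on the first completion message and case (4) on the second.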
In an alternative embodiment, referring to fig. 3, the function of releasing the buffer unit may be implemented by the completion message processing circuit in the read initiation circuit, or other circuits may be provided in the read initiation circuit to implement the function of releasing the buffer unit.
FIG. 8 is a schematic diagram of a circuit for releasing a buffer unit according to an embodiment of the application. As shown in fig. 8, the circuit for freeing the cache unit includes a memory M1, a calculation circuit W1, a comparison circuit, and a PRP entry deletion circuit.
The memory M1 is configured to record the number of storage command processing completion messages corresponding to each PRP entry indicated by the NVMe read command, that is, the number of times the PRP entry has been used. Each time a storage command processing completion message is received, the number of times the corresponding PRP entry has been used is increased by 1 and recorded in the memory M1.
The calculation circuit W1 receives the storage command processing completion message and, based on it, calculates the expected number of times the corresponding PRP entry is used. As shown in fig. 7A and 7C, for example, when the calculation circuit receives the storage command processing completion message corresponding to data 1, for which LBA=0, the corresponding PRP entry is PRP0. PRP0 accommodates data of 2 DTUs and should be released after 2 storage command processing completion messages corresponding to PRP0 are received, so the expected number of times PRP0 is used is 2. For another example, when the calculation circuit receives the storage command processing completion message corresponding to data 2, for which LBA=1, the corresponding PRP entries include PRP0, PRP1, and PRP2. PRP0 accommodates data of 2 DTUs and should be released after 2 storage command processing completion messages corresponding to PRP0 are received, so the expected number of times PRP0 is used is 2. PRP1 accommodates data of 1 DTU and should be released after 1 storage command processing completion message corresponding to PRP1 is received, so the expected number of times PRP1 is used is 1. PRP2 accommodates data of 2 DTUs and should be released after 2 storage command processing completion messages corresponding to PRP2 are received, so the expected number of times PRP2 is used is 2.
The comparison circuit (comparator) is coupled to the memory M1 and the calculation circuit W1, and is configured to compare whether the number of times the PRP entry has been used, as recorded in the memory M1, is the same as the expected number of times calculated by the calculation circuit W1, and if so, to provide the index of the PRP entry to the PRP entry deletion circuit. For example, the comparator compares whether the number of times PRP0 has been used is equal to 2, and if so, provides the index of the PRP0 entry to the PRP entry deletion circuit.
The PRP entry deletion circuit deletes the corresponding PRP entry in the cache unit according to the index of the received PRP entry. For example, the PRP entry deletion circuit receives PRP0 provided by the comparator, deletes PRP0 in the cache unit, and releases the cache unit.
Alternatively, where the memory location represented by the logical address and the host memory location represented by each PRP entry are both 4KB in size, the memory M1 may be a 1-bit register. The calculation circuit W1 receives the storage command processing completion message, and calculates the expected number of times the PRP entry corresponding thereto is used, based on the storage command processing completion message. The comparator acquires the expected number of times the PRP entry calculated by the calculation circuit W1 is used and the value of the memory M1. If the expected number of times the PRP entry is used is 1 and the value of the memory M1 is 0, when the calculation circuit W1 receives the storage command processing completion message corresponding to the PRP entry, the comparator supplies the index of the PRP entry to the PRP entry deletion circuit. If the expected number of times the PRP entry is used is 2 and the value of the memory M1 is 1, the current computing circuit W1 receives the storage command processing completion message corresponding to the PRP entry for the second time, and the comparator supplies the index of the PRP entry to the PRP entry deletion circuit.
FIG. 9 is a schematic diagram of another circuit for releasing a cache unit according to an embodiment of the present application. As shown in fig. 9, the circuit for releasing the cache unit includes a memory M2, a calculation circuit W2, and a control circuit.
Wherein the memory M2 is used for recording the number of storage command processing completion messages corresponding to the PRP entries indicated by the NVMe read command, i.e. the number of times the PRP entries indicated by the NVMe read command have been used.
The calculation circuit W2 receives the storage command processing completion message, calculates the expected number of times the corresponding one or more PRP entries are used, based on the LBA indicated by the storage command processing completion message, and supplies the expected number of times the PRP entries are used to the control circuit.
The control circuit is coupled to the memory M2 and the calculation circuit W2. It receives the expected number of times the PRP entry is used from the calculation circuit W2 (process (1) in fig. 9), obtains the number of times the PRP entry has been used from the memory M2 (process (2) in fig. 9), compares the number of times the PRP entry has been used with the expected number of times, and manages the PRP entry in the cache unit according to the comparison result. If the number of times the PRP entry has been used is equal to the expected number of times, the PRP entry in the cache unit is deleted and the cache unit is released (process (3) in fig. 9). If the number of times the PRP entry has been used is not equal to the expected number of times, the control circuit causes the number of times the PRP entry has been used, as recorded in the memory M2, to be increased by 1 (process (4) in fig. 9).
Alternatively, where the storage unit represented by the logical address and the host memory unit represented by each PRP entry are both 4KB in size, the memory M2 may be a 1-bit register. The calculation circuit W2 receives the storage command processing completion message and, based on it, calculates the expected number of times the corresponding PRP entry is used. The control circuit receives the expected number of times from the calculation circuit W2 and obtains the value of the memory M2. If the expected number of times the PRP entry is used is 2 and the value of the memory M2 is 1, the calculation circuit W2 has received a storage command processing completion message corresponding to the PRP entry for the second time, and the control circuit deletes the PRP entry in the cache unit and releases the cache unit. If the expected number of times is 2 and the value of the memory M2 is 0, the calculation circuit W2 has received the first storage command processing completion message corresponding to the PRP entry, and the control circuit inverts the value of the memory M2, updating it to 1. If the expected number of times is 1 and the value of the memory M2 is 0, the calculation circuit W2 has received the storage command processing completion message corresponding to the PRP entry, and the control circuit deletes the PRP entry in the cache unit and releases the cache unit. If the expected number of times is 1 and the value of the memory M2 is 1, the control circuit reports an error.
In other cases, the size of the memory block indicated by each PRP entry (in the host memory) is not the same as the size of the storage unit represented by each LBA; for example, the size of the memory block indicated by each PRP entry is 8KB, and the size of the storage unit represented by each LBA is 4KB. The process of releasing the cache unit is described below for the case where the memory block indicated by each PRP entry is 8KB (see fig. 10A) and the storage unit represented by each LBA is 4KB.
The data that the NVMe read command indicates needs to be moved from the storage device is recorded as data 1 to data 6. As shown in fig. 10B, PRP0-PRP3 are PRP entries indicated by NVMe read commands, respectively, and the size of the memory block indicated by each of PRP0-PRP3 is 8KB. As shown in fig. 10B, data is stored starting from the memory block indicated by PRP0, i.e., data 1 is stored in PRP 0. Since data 1 is smaller than 4KB and the start address of the required data is not aligned with 8KB, after data 1 is stored in the memory block indicated by PRP0, the memory block indicated by PRP0 also stores data 2, and since the sum of data 1 and data 2 is smaller than 8KB, the memory block indicated by PRP0 can also store part of the data of data 3. Similarly, the memory block indicated by PRP1 may store the remaining data of data 3, the complete data 4 and the partial data of data 5, and the memory block indicated by PRP2 may store the remaining data of data 5 and the partial data of data 6.
For PRP0, since the memory block indicated by PRP0 stores data 1, data 2, and part of data 3, PRP0 is not deleted from the cache unit after the storage command processing completion message corresponding to data 1 is received; the operation of deleting PRP0 from the cache unit is initiated only after the storage command processing completion messages corresponding to data 2 and data 3 have also been received. That is, a storage command processing completion message associated with PRP0 must be received three times before PRP0 is deleted from the cache unit. For PRP1, since the memory block indicated by PRP1 stores part of data 3, data 4, and part of data 5, a storage command processing completion message associated with PRP1 must also be received three times before PRP1 is deleted from the cache unit. For PRP2, since the memory block indicated by PRP2 stores only part of data 5 and part of data 6, a storage command processing completion message associated with PRP2 must be received twice before PRP2 is deleted from the cache unit.
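The expected number of completion messages per PRP entry can be viewed as the number of data segments that overlap the memory block of that entry. The sketch below illustrates this counting under simplifying assumptions: the data segments are laid out back to back starting at the beginning of the first memory block, and the segment sizes (2KB for data 1, 4KB for the others) are hypothetical values chosen only so that the counts for PRP0 to PRP2 match the three, three, and two messages described above. The function name `expected_uses` is likewise hypothetical.

```python
from typing import List

KB = 1024


def expected_uses(block_size: int, segment_sizes: List[int]) -> List[int]:
    """Count, for each memory block, how many data segments overlap it.

    block_size: size in bytes of the memory block indicated by one PRP entry.
    segment_sizes: sizes in bytes of data 1, data 2, ... stored contiguously.
    Returns one count per block, i.e. the expected number of completion
    messages before the corresponding PRP entry may be released.
    """
    total = sum(segment_sizes)
    n_blocks = -(-total // block_size)  # ceiling division
    counts = [0] * n_blocks
    start = 0
    for size in segment_sizes:
        if size <= 0:
            continue  # an empty segment touches no block
        end = start + size
        first_block = start // block_size
        last_block = (end - 1) // block_size
        for block in range(first_block, last_block + 1):
            counts[block] += 1
        start = end
    return counts


# Hypothetical layout: data 1 is 2KB, data 2 to data 6 are 4KB each, 8KB blocks.
counts = expected_uses(8 * KB, [2 * KB, 4 * KB, 4 * KB, 4 * KB, 4 * KB, 4 * KB])
```

With this hypothetical layout, `counts` holds one value per memory block in the order PRP0, PRP1, PRP2.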
Referring to the embodiment shown in fig. 3 to fig. 10B, as shown in fig. 11, a cache unit management method for a storage device according to an embodiment of the present application includes:
Step S1101, in response to receiving the IO command, obtaining address location information indicated by the IO command and adding the address location information to the allocated cache unit. The address location information comprises a PRP entry, an SGL descriptor and a host memory address determined according to the PRP entry or the SGL descriptor. There may be one or more PRP entries, SGL descriptors, and host memory addresses determined from the PRP entries or SGL descriptors.
Step S1102, if the IO command is an NVMe write command, in response to a DMA command being generated according to the address location information indicated by the NVMe write command, releasing the cache unit occupied by that address location information.
Step S1103, if the IO command is an NVMe read command, releasing a cache unit occupied by the address location information indicated by the storage command processing completion message in response to the fact that the number of times the address location information indicated by the storage command processing completion message is used is equal to the expected number of times, wherein the storage command indicated by the storage command processing completion message is generated according to the NVMe read command.
Step S1104, if the IO command is an NVMe read command, in response to the number of times the address location information indicated by the storage command processing completion message has been used being not equal to the expected number of times, increasing by one the recorded number of times that address location information has been used.
In the process of processing the NVMe read command, there may be an error in the storage command processing completion messages fed back by the storage command processing unit. The error conditions include, for example, the storage command processing unit feeding back too many storage command processing completion messages for a PRP entry (e.g., a repeated reply), or feeding back too few or none at all. For example, referring to FIG. 7C, for the NVMe read command, the storage command processing completion messages that the storage command processing unit is expected to feed back include a storage command processing completion message indicating the PRP0 entry twice, the PRP1 entry once, the PRP2 entry twice, the PRP3 entry twice, the PRP4 entry once, the PRP5 entry once, and the PRP6 entry twice. However, the storage command processing completion messages actually received from the storage command processing unit may lack the message corresponding to a certain PRP entry, or may include a repeated message, for example:
(1) The storage command processing completion messages actually received include a message indicating the PRP0 entry twice, the PRP2 entry twice, the PRP3 entry twice, the PRP4 entry once, the PRP5 entry once, and the PRP6 entry twice, but no message indicating the PRP1 entry. The storage command processing completion message of the PRP1 entry has been missed, so the cache unit occupied by the PRP1 entry cannot be released;
(2) A storage command processing completion message indicating the PRP0 entry is expected to be received twice (with corresponding logical addresses LBA0 and LBA1, respectively), but is actually received only once (with corresponding logical address LBA0). Too few storage command processing completion messages have been fed back for the PRP0 entry, so the cache unit occupied by the PRP0 entry cannot be released;
(3) A storage command processing completion message indicating the PRP0 entry is expected to be received twice (with corresponding logical addresses LBA0 and LBA1, respectively), and is actually received twice, but the logical address corresponding to both received messages is LBA1. The message corresponding to LBA1 has been fed back repeatedly and the message corresponding to LBA0 has not been fed back, so the cache unit occupied by the PRP0 entry is released and the storage command corresponding to LBA0 cannot be processed;
(4) A storage command processing completion message indicating the PRP1 entry is expected to be received once but is actually received twice. The storage command processing completion message of the PRP1 entry has been fed back repeatedly, so the storage command processing completion messages contain an error, and an error is reported.
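To make scenarios (1) and (4) concrete, the following sketch compares the expected per-entry message counts for the fig. 7C example (twice for PRP0, PRP2, PRP3, and PRP6; once for PRP1, PRP4, and PRP5, as listed above) with the messages actually received. It is an offline illustration only, with the hypothetical helper name `classify`; it counts messages per PRP entry and therefore cannot distinguish scenario (3), where the count is correct but the logical addresses are wrong.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Expected number of completion messages per PRP entry, taken from the
# fig. 7C example in the text.
EXPECTED = Counter(
    {"PRP0": 2, "PRP1": 1, "PRP2": 2, "PRP3": 2, "PRP4": 1, "PRP5": 1, "PRP6": 2}
)


def classify(
    expected: Counter, received: Iterable[str]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (missing, repeated) message counts per PRP entry.

    received: one PRP entry name per completion message actually received.
    Counter subtraction keeps only positive differences, which is exactly
    the shortfall (missing) or surplus (repeated) per entry.
    """
    got = Counter(received)
    missing = dict(expected - got)
    repeated = dict(got - expected)
    return missing, repeated


all_messages = list(EXPECTED.elements())

# Scenario (1): the message for PRP1 is never fed back.
scenario_1 = [name for name in all_messages if name != "PRP1"]
missing_1, repeated_1 = classify(EXPECTED, scenario_1)

# Scenario (4): the message for PRP1 is fed back twice.
scenario_4 = all_messages + ["PRP1"]
missing_4, repeated_4 = classify(EXPECTED, scenario_4)
```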
It is understood that, in processing the NVMe read command, fewer storage command processing completion messages than expected may be received, or a storage command processing completion message may be received repeatedly. In order to identify whether the NVMe read command is properly processed, it is necessary to monitor whether the received storage command processing completion messages contain an error.
As an example, repeated reception of a storage command processing completion message indicating a certain PRP entry is detected from the number of storage command processing completion messages received for that PRP entry. For example, if the number of times a PRP entry has been used is greater than the expected number of times it is used, it is determined that a storage command processing completion message indicating the PRP entry has been received repeatedly. For example, the expected number of times the PRP1 entry in fig. 7C is used is 1; if a storage command processing completion message indicating the PRP1 entry is in fact received twice (i.e., the number of times the PRP1 entry has been used is 2), it is determined that the storage command processing completion message indicating the PRP1 entry has been replied repeatedly and that the storage command processing completion messages contain an error.
As another example, the cache unit occupied by the PRP0 entry is released only after two storage command processing completion messages are received. If two such messages are received, the actual number of messages equals the expected number and the cache unit occupied by the PRP0 entry is released. However, if both received messages correspond to LBA1 and no message corresponding to LBA0 is received, the cache unit occupied by the PRP0 entry has been released while the storage command corresponding to LBA0 can no longer be processed; the erroneous storage command processing completion message is then determined from the fact that this storage command cannot be processed.
In an alternative embodiment, the monitoring of the storage command processing completion messages is performed in conjunction with the release of the cache unit. For example, a corresponding counter is provided for each PRP entry to record the number of times a storage command processing completion message indicating the PRP entry has been received, and the value of the counter is denoted count. As another example, as shown in fig. 8 or 9, a memory is provided to record the number of received storage command processing completion messages indicating each PRP entry, in which case count denotes the value in the memory. When a storage command processing completion message, such as CPL1, is received, the expected number of times the PRP entry indicated by the message is used is determined from the message and is denoted Pcount. The value count is then obtained from the counter (the recorded count does not yet include CPL1), and Pcount is compared with count:
If Pcount > count+1, the cache unit occupied by the PRP entry is not released, and the number of times the PRP entry has been used, as recorded by the counter, is increased by 1;
If Pcount = count+1, the cache unit occupied by the PRP entry is released;
If Pcount < count+1, it is determined that the storage command processing completion messages contain an error and that a repeated storage command processing completion message has been received.
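The three-way comparison above can be modeled with a per-entry counter as follows. This is a hedged sketch rather than the embodiment's circuit: the class name `CompletionTracker` and its return strings are hypothetical, and the sketch assumes that the counter is kept (not cleared) after a release so that a later duplicate message can still be recognized.

```python
from typing import Dict


class CompletionTracker:
    """Software model of the Pcount versus count+1 comparison."""

    def __init__(self) -> None:
        # count per PRP entry index: completion messages received so far
        self.count: Dict[int, int] = {}

    def on_completion(self, prp_index: int, pcount: int) -> str:
        """Handle one completion message for the PRP entry `prp_index`.

        pcount: expected number of times the PRP entry is used (Pcount).
        Returns "hold", "release", or "error".
        """
        count = self.count.get(prp_index, 0)  # does not include this message
        if pcount > count + 1:
            self.count[prp_index] = count + 1
            return "hold"      # more messages are still expected
        if pcount == count + 1:
            # Assumption: keep counting after release so duplicates are seen.
            self.count[prp_index] = count + 1
            return "release"   # free the cache unit occupied by the entry
        return "error"         # Pcount < count + 1: repeated message


tracker = CompletionTracker()
first = tracker.on_completion(0, 2)   # first message for an entry used twice
second = tracker.on_completion(0, 2)  # second message for the same entry
third = tracker.on_completion(0, 2)   # an unexpected third message
```

A design that clears the counter on release, as the optional register clearing described earlier does, would need some other record of the release in order to flag the third message.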
As an alternative example, as shown in fig. 9, the number of received storage command processing completion messages is recorded by the memory M2. In the embodiment shown in fig. 9, in response to receiving a storage command processing completion message, such as CPL2, the calculation circuit W2 calculates the expected number of times the PRP entry indicated by the message, such as PRP entry 3, is used, denotes this expected number Pcount3, and supplies Pcount3 to the control circuit. The control circuit obtains from the memory M2 the number of storage command processing completion messages received for PRP entry 3, denoted count3. The control circuit compares Pcount3 with count3. If Pcount3 > count3+1, the cache unit occupied by PRP entry 3 is not released, and the control circuit causes the number of storage command processing completion messages recorded in the memory M2 for PRP entry 3 to be increased by 1. If Pcount3 = count3+1, the control circuit releases the cache unit occupied by PRP entry 3. If Pcount3 < count3+1, the control circuit determines that the storage command processing completion messages contain an error and that a repeated storage command processing completion message has been received.
For the case where a storage command processing completion message is not received, the release status of the cache unit occupied by the PRP entry alone cannot show whether the storage command corresponding to the PRP entry has not been processed, or has been processed without a correct storage command processing completion message being generated. Thus, in an alternative embodiment, a timestamp representing the receipt of the NVMe read command may be recorded for each NVMe read command; for example, the time at which the NVMe read command was added to the information cache (as shown in fig. 3) is recorded in the information cache. If the NVMe read command indicates a plurality of PRP entries, the timestamp corresponding to each of these PRP entries is the same. For each PRP entry, the difference between the current time and the corresponding timestamp is calculated to obtain the delay time of the PRP entry. When the delay time is greater than a specified value and the cache unit occupied by the PRP entry has not yet been released, it is determined that a storage command processing completion message indicating the PRP entry has not been received (too few storage command processing completion messages have been returned for the PRP entry) and that the storage command processing completion messages contain an error.
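The timestamp check described above can be illustrated as follows. This is a minimal sketch under stated assumptions: times are plain numbers in arbitrary units, and the names `PrpState` and `overdue_entries` are hypothetical and not part of the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PrpState:
    timestamp: float        # time the NVMe read command was received
    released: bool = False  # whether the occupied cache unit was released


def overdue_entries(
    entries: Dict[int, PrpState], now: float, limit: float
) -> List[int]:
    """Return the indexes of PRP entries whose completion message is missing.

    An entry is reported when its delay time (now - timestamp) is greater
    than `limit` and its cache unit has not been released yet.
    """
    return [
        index
        for index, state in entries.items()
        if not state.released and now - state.timestamp > limit
    ]


# PRP entries 0 and 1 come from the same read command (same timestamp);
# entry 0 was released in time, entry 1 was not. Entry 2 is still recent.
entries = {
    0: PrpState(timestamp=10.0, released=True),
    1: PrpState(timestamp=10.0),
    2: PrpState(timestamp=95.0),
}
late = overdue_entries(entries, now=100.0, limit=50.0)
```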
In an alternative embodiment, if a plurality of storage command processing completion messages are received, the timestamp of each received storage command processing completion message is recorded in sequence, and the difference between the current time and the timestamp corresponding to each message is calculated and used as the delay time of that message. The processing order of the plurality of storage command processing completion messages is then determined according to their delay times. For example, the messages are processed in descending order of delay time, so that the message with the longest delay time is processed first; that is, whether to release the cache unit occupied by the PRP entry indicated by the message with the longest delay time is determined first.
Alternatively, a plurality of registers may be provided in the control section to record the maximum value and the minimum value of the delay times corresponding to the storage command processing completion messages, and the number of storage command processing completion messages within each delay time interval between the maximum value and the minimum value. The boundary value of each delay time interval is configurable. The values of these registers may be read by firmware. When a plurality of storage command processing completion messages are received, each message is processed according to its delay time; for example, the message with the longest delay time is processed first, and the cache unit occupied by the PRP entry indicated by that message is released first. Optionally, if the delay time corresponding to a storage command processing completion message is greater than a preset threshold, that is, the difference between the current time and the timestamp corresponding to the message is greater than the preset threshold, the relevant information of the storage command corresponding to the message is stored in metadata of the storage device. The host reads the storage command related information stored in the metadata of the storage device according to the logical address indicated by a custom command that conforms to the NVMe protocol. For example, 3 storage command processing completion messages are received, namely storage command processing completion message 1, storage command processing completion message 2, and storage command processing completion message 3.
The timestamp corresponding to storage command processing completion message 1 is t1, the timestamp corresponding to storage command processing completion message 2 is t2, and the timestamp corresponding to storage command processing completion message 3 is t3. The current time is t4. If (t4-t3) > (t4-t2) > (t4-t1), storage command processing completion message 3 is processed first, then storage command processing completion message 2, and then storage command processing completion message 1. If (t4-t3) is greater than the specified threshold Tm, the relevant information of the storage command corresponding to storage command processing completion message 3 is stored in metadata of the storage device. In response to receiving a custom command conforming to the NVMe protocol sent by the host, the related information of the storage command stored in the metadata is read according to the logical address indicated by the custom command.
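The ordering in this example can be sketched in a few lines. The timestamp values and the threshold below are hypothetical numbers chosen so that (t4-t3) > (t4-t2) > (t4-t1) holds, and the function name `order_by_delay` is an assumption made for illustration.

```python
from typing import Dict, List, Tuple


def order_by_delay(
    timestamps: Dict[str, float], now: float, threshold: float
) -> Tuple[List[str], List[str]]:
    """Order completion messages by delay time, longest delay first.

    timestamps: timestamp recorded for each completion message.
    Returns (processing order, messages whose delay exceeds the threshold).
    """
    delays = {name: now - ts for name, ts in timestamps.items()}
    order = sorted(delays, key=lambda name: delays[name], reverse=True)
    over_threshold = [name for name in order if delays[name] > threshold]
    return order, over_threshold


# Hypothetical values: t1 = 30, t2 = 20, t3 = 10, t4 = 40, Tm = 25.
stamps = {"message 1": 30.0, "message 2": 20.0, "message 3": 10.0}
order, over_threshold = order_by_delay(stamps, now=40.0, threshold=25.0)
```

Here message 3 has the longest delay (30), so it is processed first and is the only one whose delay exceeds Tm; in the scheme described above its storage command information would then be stored in the metadata of the storage device.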
In an alternative embodiment, in response to processing the storage command processing completion message, a DMA command is generated (e.g. procedure (2) in FIG. 3) from the storage command processing completion message and data is moved from the storage device cache to the host in accordance with the DMA command.
Optionally, while cache units are being released, it is also detected whether fewer storage command processing completion messages than expected have been received. For example, as shown in FIG. 12, a timestamp representing the time at which the NVMe read command was received is recorded in the information buffer shown in FIG. 3, and the information buffer shown in FIG. 3 is coupled to the control circuit shown in FIG. 9. The control circuit obtains the timestamp of the NVMe read command recorded in the information buffer and determines from it the timestamp of a PRP entry corresponding to the NVMe read command, for example PRP entry 4. The control circuit calculates the difference between the current time and the timestamp of PRP entry 4 to obtain the delay time corresponding to PRP entry 4. If the delay time corresponding to PRP entry 4 is greater than or equal to a specified value and the cache unit occupied by PRP entry 4 has not been released, it is determined that a storage command processing completion message corresponding to PRP entry 4 is missing. Alternatively, the timestamp representing the receipt of the NVMe read command may be recorded in the information buffer shown in FIG. 3 and synchronized to the memory M2 shown in FIG. 9. That is, the memory M2 shown in FIG. 9 may record not only the number of storage command processing completion messages corresponding to each PRP entry but also the timestamp corresponding to that PRP entry, and the control circuit obtains the timestamp corresponding to each PRP entry from the memory M2.
Alternatively, the embodiment of the present application provides a block diagram of another circuit for releasing the cache unit, as shown in FIG. 13. As shown in FIG. 13, the circuit for releasing the cache unit includes a calculation circuit W2, a calculation circuit W3, a memory M2, a memory M3, and a control circuit. The functions implemented by the calculation circuit W2 and the memory M2 are the same as those of the calculation circuit W2 and the memory M2 shown in FIG. 9, and are not described here again. The memory M3 records a timestamp corresponding to the address location information, and the calculation circuit W3 calculates the difference between the current time and the timestamp corresponding to the address location information to determine the delay time of the address location information. When the delay time is greater than or equal to a specified value, the address location information is provided to the control circuit, and the control circuit determines whether the cache unit occupied by the address location information has been released. If it has not, the control circuit determines that an error exists in the storage command processing completion messages.
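The timestamp-based check performed by the calculation circuit W3 and the control circuit can be sketched in software as follows. This is only an illustrative model under stated assumptions: the function name, the dictionary standing in for the memory M3, the set of released entries, and the time values are hypothetical and do not appear in the embodiment.

```python
def find_overdue_entries(entry_timestamps, released, now, specified_value):
    """Return entries whose delay reached the specified value but whose
    cache unit is still occupied, i.e. a completion message is missing."""
    overdue = []
    for entry, ts in entry_timestamps.items():
        delay = now - ts  # role of the calculation circuit W3
        if delay >= specified_value and entry not in released:
            overdue.append(entry)  # role of the control circuit
    return overdue

# Hypothetical timestamps in milliseconds (stand-in for the memory M3).
entry_timestamps = {"PRP3": 100, "PRP4": 100, "PRP5": 990}
released = {"PRP3"}  # cache units already released
print(find_overdue_entries(entry_timestamps, released, now=1000, specified_value=45))
```

Here PRP3 is overdue but already released, PRP5 has not yet reached the specified value, and only PRP4 is reported as missing a completion message.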
The above describes, from the perspective of the PRP entry, determining whether an erroneous storage command processing completion message exists according to the expected number of times the PRP entry is used and the number of times it has been used, and according to the delay time corresponding to the PRP entry. The following describes the process of determining whether an erroneous storage command processing completion message exists from the perspective of the NVMe read command.
Optionally, whether an error exists in the storage command processing completion messages corresponding to an NVMe read command is determined according to the number of storage command processing completion messages the NVMe read command expects to receive and the number of storage command processing completion messages actually received. For example, the storage command processing completion messages that NVMe read command 1 expects to receive include one storage command processing completion message corresponding to each of LBA0, LBA1, LBA2, LBA3, LBA4, and LBA5. For NVMe read command 1, 6 storage command processing completion messages are therefore expected to be received.
In some alternative embodiments, the actually received storage command processing completion messages include one message corresponding to LBA0, two messages corresponding to LBA1, and one message corresponding to each of LBA2, LBA3, LBA4, and LBA5. For NVMe read command 1, 7 storage command processing completion messages are actually received. Because the number of actually received storage command processing completion messages is larger than the number expected to be received, it is determined that an error exists in the storage command processing completion messages corresponding to the NVMe read command.
In other alternative embodiments, the actually received storage command processing completion messages include one message corresponding to LBA0, two messages corresponding to LBA1, and one message corresponding to each of LBA2, LBA3, and LBA4. For NVMe read command 1, 6 storage command processing completion messages are actually received. The number of actually received storage command processing completion messages is equal to the number expected to be received, but one storage command processing completion message corresponding to LBA1 was expected and two were actually received, and one storage command processing completion message indicating LBA5 was expected and none was actually received. The logical addresses indicated by the expected storage command processing completion messages therefore differ from the logical addresses indicated by the actually received storage command processing completion messages, and it is determined that an error exists in the storage command processing completion messages corresponding to the NVMe read command.
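The two comparisons above, one on the total number of messages and one on the logical address each message indicates, can be sketched as follows. This is an illustrative model only; the function name and the returned field names are hypothetical, and LBAs are represented as plain integers for simplicity.

```python
from collections import Counter

def check_completions(expected_lbas, received_lbas):
    """Compare expected and actually received completion messages per LBA."""
    expected = Counter(expected_lbas)
    received = Counter(received_lbas)
    # LBAs reported more often than expected (duplicate completions).
    duplicated = sorted(l for l in received if received[l] > expected[l])
    # LBAs reported less often than expected (missing completions).
    missing = sorted(l for l in expected if received[l] < expected[l])
    count_mismatch = sum(received.values()) != sum(expected.values())
    return {
        "count_mismatch": count_mismatch,
        "duplicated": duplicated,
        "missing": missing,
        "error": count_mismatch or bool(duplicated) or bool(missing),
    }

expected = [0, 1, 2, 3, 4, 5]  # NVMe read command 1 expects LBA0 to LBA5
print(check_completions(expected, [0, 1, 1, 2, 3, 4, 5]))  # 7 received
print(check_completions(expected, [0, 1, 1, 2, 3, 4]))     # 6 received, LBA5 absent
```

The second call shows why comparing totals alone is not sufficient: the totals match, yet LBA1 is duplicated and LBA5 is missing, so an error is still reported.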
Still alternatively, whether an erroneous storage command processing completion message exists is determined according to the delay time corresponding to the NVMe read command. If the delay time corresponding to the NVMe read command is greater than or equal to the specified value and the cache units occupied by the PRP entries corresponding to the NVMe read command have not all been released, it is determined that an erroneous storage command processing completion message exists. For example, the PRP entries corresponding to NVMe read command 1 include the PRP0 entry, PRP1 entry, PRP2 entry, PRP3 entry, PRP4 entry, PRP5 entry, and PRP6 entry. The delay time currently corresponding to NVMe read command 1 is 1 second and the specified value is 45 milliseconds. The delay time corresponding to NVMe read command 1 is greater than the specified value and the cache unit occupied by the PRP5 entry has not yet been released, so it is determined that an erroneous storage command processing completion message exists. When NVMe read command 1 is received, it is added to the information buffer shown in FIG. 3 and a timestamp representing the receipt of NVMe read command 1 is recorded. The delay time corresponding to NVMe read command 1 is obtained from the difference between the current time and that timestamp.
With reference to the embodiments shown in FIG. 3 to FIG. 13, and as shown in FIG. 14, a cache unit management method of a storage device according to an embodiment of the present application includes:
Step S1401, in response to receiving an IO command, obtaining the address location information indicated by the IO command and adding the address location information to an allocated cache unit. The address location information includes a PRP entry, an SGL descriptor, and a host memory address determined according to the PRP entry or the SGL descriptor. There may be one or more PRP entries, SGL descriptors, and host memory addresses determined from the PRP entries or SGL descriptors.
Step S1402, if the IO command is an NVMe write command, generating a DMA command in response to the address location information indicated by the NVMe write command, and releasing the cache unit occupied by the target address location information.
Step S1403, if the IO command is an NVMe read command, in response to receiving a storage command processing completion message, if the number of times the address location information indicated by the storage command processing completion message has been used is equal to the expected number of times, releasing the cache unit occupied by the address location information indicated by the storage command processing completion message, wherein the storage command indicated by the storage command processing completion message is generated according to the NVMe read command.
Step S1404, if the IO command is an NVMe read command, in response to receiving a storage command processing completion message, if the number of times the address location information indicated by the storage command processing completion message has been used is smaller than the expected number of times, not releasing the cache unit occupied by the address location information indicated by the storage command processing completion message, and incrementing by 1 the number of times the address location information indicated by the storage command processing completion message has been used.
Step S1405, if the IO command is an NVMe read command, in response to receiving a storage command processing completion message, if the number of times the address location information indicated by the storage command processing completion message has been used is greater than the expected number of times, determining that an erroneous storage command processing completion message exists.
If the IO command is an NVMe read command, in response to the delay time corresponding to the NVMe read command being greater than or equal to a specified value and the cache units occupied by the address location information indicated by the NVMe read command not all being released, determining that an erroneous storage command processing completion message exists, wherein the delay time corresponding to the NVMe read command is equal to the difference between the current time and the timestamp of the NVMe read command.
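The read-command path of the above steps can be modeled with the following minimal sketch. It covers only steps S1401 and S1403 to S1405 and the delay-time check, not the write path of step S1402. The class name, method names, and return strings are hypothetical. The sketch also makes one assumption that the steps leave open: the use count is incremented for the current completion message before it is compared with the expected number of times.

```python
class CacheUnitManager:
    """Toy model of per-entry use counting and delay-time checking."""

    def __init__(self):
        self.entries = {}   # entry id -> {"expected", "used", "released"}
        self.commands = {}  # command id -> (timestamp, [entry ids])

    def on_read_command(self, cmd_id, expected_uses, now):
        # S1401: add address location information and start counting.
        for entry, expected in expected_uses.items():
            self.entries[entry] = {"expected": expected, "used": 0,
                                   "released": False}
        self.commands[cmd_id] = (now, list(expected_uses))

    def on_completion(self, entry):
        rec = self.entries[entry]
        rec["used"] += 1  # assumption: count this message, then compare
        if rec["used"] > rec["expected"]:
            return "error"      # S1405: more completions than expected
        if rec["used"] == rec["expected"]:
            rec["released"] = True
            return "released"   # S1403: release the cache unit
        return "kept"           # S1404: keep the cache unit

    def timed_out(self, cmd_id, now, specified_value):
        # Delay-time check: overdue command with unreleased cache units.
        ts, entries = self.commands[cmd_id]
        pending = any(not self.entries[e]["released"] for e in entries)
        return now - ts >= specified_value and pending

mgr = CacheUnitManager()
mgr.on_read_command("read1", {"PRP4": 2, "PRP5": 1}, now=0)
print([mgr.on_completion("PRP4") for _ in range(3)])
print(mgr.timed_out("read1", now=1000, specified_value=45))
```

In the usage lines, the entry expected to be used twice is kept after the first message, released after the second, and flagged on the third, while the command is reported as overdue because the cache unit of the other entry is still occupied.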
According to the cache unit management method of the storage device, the cache unit occupied by the address location information indicated by an IO command can be released in a timely manner and made available to the address location information indicated by other IO commands, which improves cache utilization. At the same time, whether an error exists in the storage command processing completion messages can be detected in the process of releasing the cache unit.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application. It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

CN202311255502.4A 2023-09-26 2023-09-26 Cache unit management method and storage device Pending CN119718727A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202311255502.4A, CN119718727A (en) | 2023-09-26 | 2023-09-26 | Cache unit management method and storage device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202311255502.4A, CN119718727A (en) | 2023-09-26 | 2023-09-26 | Cache unit management method and storage device

Publications (1)

Publication Number | Publication Date
CN119718727A (en) | 2025-03-28

Family

ID=95102434

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202311255502.4A, Pending, CN119718727A (en) | Cache unit management method and storage device | 2023-09-26 | 2023-09-26

Country Status (1)

Country | Link
CN (1) | CN119718727A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN120068752A (en)* | 2025-04-28 | 2025-05-30 | 山东云海国创云计算装备产业创新中心有限公司 | Simulation excitation generation method and device, electronic equipment and storage medium

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination