
NVMe write IO initiator circuit of NVMe controller

Info

Publication number
CN119229912A
Authority
CN
China
Prior art keywords
circuit
dma
command
write
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310788448.3A
Other languages
Chinese (zh)
Inventor
王玉巧
兰彤
黄好城
刘传杰
李正审
蔡德卿
肖峰
聂鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Starblaze Technology Co ltd
Original Assignee
Beijing Starblaze Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Starblaze Technology Co., Ltd.
Priority to CN202310788448.3A
Publication of CN119229912A
Legal status: Pending

Abstract


An embodiment of the present application provides an NVMe write IO initiation circuit of an NVMe controller, relating to the field of storage technology. The write initiation circuit includes a DMA command generation circuit coupled with a DMA transmission circuit, which, in response to receiving addressing information, generates a DMA command indicating movement of data from host memory to the storage device cache based on the addressing information, and provides the DMA command to the DMA transmission circuit; and a storage command generation circuit coupled with the DMA transmission circuit and a storage command processing unit, which, in response to receiving a first message sent by the DMA transmission circuit indicating that DMA command processing is complete, generates a storage command indicating movement of data from the storage device cache to the storage device flash memory, and provides the storage command to the storage command processing unit. The write initiation circuit is a hardware circuit independent of the CPU; it offloads the task of generating DMA commands from the CPU, reducing the burden on the CPU.

Description

NVMe write IO initiating circuit of NVMe controller
Technical Field
The application relates to the technical field of storage, in particular to an NVMe write IO initiating circuit of an NVMe controller.
Background
FIG. 1A illustrates a block diagram of a solid state storage device. The solid state storage device 102 is coupled to a host to provide storage capability for the host. The host and the solid state storage device 102 may be coupled in a variety of ways, including, but not limited to, SATA (Serial Advanced Technology Attachment), SCSI (Small Computer System Interface), SAS (Serial Attached SCSI), IDE (Integrated Drive Electronics), USB (Universal Serial Bus), PCIe (Peripheral Component Interconnect Express), NVMe (NVM Express), Ethernet, Fibre Channel, a wireless communication network, and the like. The host may be an information processing device capable of communicating with the storage device in the manners described above, such as a personal computer, tablet, server, portable computer, network switch, router, cellular telephone, or personal digital assistant. The storage device 102 (hereinafter, the solid state storage device is simply referred to as the storage device) includes an interface 103, a control unit 104, one or more NVM chips 105, and a DRAM (Dynamic Random Access Memory) 110.
The NVM chips 105 described above include common storage media such as NAND flash memory, phase change memory, FeRAM (Ferroelectric RAM), MRAM (Magnetic Random Access Memory), and RRAM (Resistive Random Access Memory).
The interface 103 may be adapted to exchange data with the host by means of, for example, SATA, IDE, USB, PCIe, NVMe, SAS, Ethernet, or Fibre Channel.
The control unit 104 controls data transmission among the interface 103, the NVM chips 105, and the DRAM 110, and also handles memory management, host logical address to flash physical address mapping, erase balancing, bad block management, and the like. The control unit 104 can be implemented in a variety of ways, such as software, hardware, firmware, or a combination thereof; for example, it can take the form of an FPGA (Field-Programmable Gate Array), an ASIC (Application-Specific Integrated Circuit), or a combination thereof. The control unit 104 may also include a processor or controller that executes software to manipulate the hardware of the control unit 104 to process IO (Input/Output) commands. The control unit 104 may also be coupled to the DRAM 110 and may access its data. FTL tables and/or cached data of IO commands may be stored in the DRAM.
The control unit 104 issues commands to the NVM chip 105 in a manner conforming to the interface protocol of the NVM chip 105 to operate the NVM chip 105, and receives command execution results output from the NVM chip 105. Known NVM chip interface protocols include "Toggle", "ONFI", and the like.
A storage target (Target) is one or more logical units (LUN, Logical Unit) sharing a CE (Chip Enable) signal within a NAND flash package. One or more dies (Die) may be included within the NAND flash package. Typically, a logical unit corresponds to a single die. A logical unit may include multiple planes (Plane). Multiple planes within a logical unit may be accessed in parallel, while multiple logical units within a NAND flash memory chip may execute commands and report status independently of each other.
Data is typically stored and read on the storage medium in units of pages, while data is erased in units of blocks. A block (also called a physical block) contains a plurality of pages. Pages on the storage medium (called physical pages) have a fixed size, e.g., 17664 bytes; physical pages may also have other sizes.
An FTL (Flash Translation Layer) is utilized in the storage device 102 to maintain mapping information from logical addresses (LBAs) to physical addresses. The logical addresses constitute the storage space of the solid state storage device as perceived by upper-level software such as the operating system. The physical address is an address for accessing a physical storage unit of the solid state storage device. Address mapping may also be implemented in the related art using an intermediate address form: logical addresses are mapped to intermediate addresses, which in turn are further mapped to physical addresses. The table structure storing mapping information from logical addresses to physical addresses is called an FTL table. FTL tables are important metadata in a storage device; the entries of an FTL table record address mapping relationships in units of data units in the storage device.
The host accesses the storage device with IO commands that follow a storage protocol. The control unit generates one or more media interface commands based on an IO command from the host and provides them to the media interface controller. The media interface controller generates storage media access commands (e.g., program commands, read commands, erase commands) that follow the interface protocol of the NVM chip according to the media interface commands. The control unit also tracks the execution of all media interface commands generated from one IO command and indicates the IO command's processing result to the host.
Referring to FIG. 1B, the control unit includes a host interface 1041, a host command processing unit 1042, a storage command processing unit 1043, a media interface controller 1044, and a storage medium management unit 1045. The host interface 1041 acquires IO commands provided by the host. The host command processing unit 1042 generates storage commands from an IO command and supplies them to the storage command processing unit 1043. Each storage command may access a memory space of the same size, e.g., 4KB. The data unit recorded in the NVM chip and accessed by one storage command is referred to as a data frame. A physical page records one or more data frames. For example, if the physical page size is 17664 bytes and the data frame size is 4KB, then one physical page can store 4 data frames.
The storage medium management unit 1045 maintains a logical address to physical address translation for each storage command. For example, the storage medium management unit 1045 includes FTL tables (the FTL is described above). For a read command, the storage medium management unit 1045 outputs the physical address corresponding to the logical address (LBA) accessed by the storage command. For a write command, it allocates an available physical address and records the mapping relationship between the accessed logical address (LBA) and the allocated physical address. The storage medium management unit 1045 also maintains functions required to manage the NVM chips, such as garbage collection and wear leveling.
The storage command processing unit 1043 operates the medium interface controller 1044 to issue a storage medium access command to the NVM chip 105 according to the physical address supplied from the storage medium management unit 1045.
For purposes of clarity, commands sent by the host to the storage device 102 are referred to as IO commands (including, for example, NVMe read commands, NVMe write commands), commands sent by the host command processing unit 1042 to the storage command processing unit 1043 are referred to as storage commands, commands sent by the storage command processing unit 1043 to the media interface controller 1044 are referred to as media interface commands, and commands sent by the media interface controller 1044 to the NVM chip 105 are referred to as storage media access commands. The storage medium access command follows the interface protocol of the NVM chip.
In the NVMe protocol, after receiving the NVMe write command, the solid-state storage device 102 obtains data from the memory of the host through the host interface 1041, and then writes the data into the flash memory. For NVMe read commands, after data is read from flash memory, the solid state storage device 102 moves the data into host memory through the host interface 1041.
A basic constitution of a host command processing unit 1042 of the related art is shown in FIG. 1C; such a host command processing unit is provided in Chinese patent application 202110746142.2. As shown in FIG. 1C, the host command processing unit 1042 includes a shared memory (located, e.g., inside the control unit, as opposed to the storage device's DRAM and NVM chips), a cache unit, an SGL/PRP unit, and a write control circuit. The write control circuit comprises a write initiator circuit and a DMA transmission circuit, which cooperate to carry out data movement. The process by which the host command processing unit 1042 handles an IO command, such as an NVMe write command, includes:
In response to receiving an NVMe write command, the SGL/PRP unit acquires the SGL/PRP corresponding to the write command, stores it in the cache unit, and generates one or more DMA command groups according to the SGL/PRP, each DMA command group comprising one or more DMA commands; the DMA command groups are stored in the shared memory. After a DMA command group is generated, the SGL/PRP unit notifies the write initiator circuit by transferring to it a DMA command group index, e.g., a pointer indicating the location of the DMA command group in the shared memory. The write initiator circuit receives the DMA command group index and transmits it to the DMA transmission circuit; the DMA transmission circuit acquires the DMA commands from the shared memory according to the index and executes them to move data from host memory to the storage device memory. When the data transfer indicated by a DMA command, or by a whole DMA command group, has ended, the DMA transmission circuit generates a notification of the end of the data transfer and sends it to the write initiator circuit.
In the NVMe protocol, there are two classes of commands: Admin commands, by which a host manages and controls the storage device, and IO commands, including NVMe write commands and NVMe read commands, which control data transmission between the host and the storage device. The fields in an IO command related to the SGL or PRP indicate the location of the data in host memory (for write commands) or the host memory address to be written (for read commands). The PRP or SGL field in an IO command may be an SGL or PRP entry pointing to the host memory address space to be accessed, a pointer to an SGL or PRP linked list, or even a pointer to a pointer. If the IO command carries the SGL or PRP itself, the storage device can acquire it directly upon receiving the IO command. If the IO command carries an SGL or PRP pointer, then upon receiving the IO command the storage device accesses the host according to that pointer and acquires the SGL or PRP from the host.
PRP (Physical Region Page) and SGL (Scatter Gather List) are two ways of describing data to be transferred between a host and a storage device. In the PRP approach, several PRP entries are linked together, each PRP entry comprising a 64-bit memory physical address describing a physical page (Page) of host memory. An SGL is a linked list of one or more SGL segments, each SGL segment in turn composed of one or more SGL descriptors; each SGL descriptor describes the address and length of a data buffer in host memory, i.e., each SGL descriptor corresponds to a host memory address space, and each descriptor has, for example, a fixed size (e.g., 16 bytes).
A DMA (Direct Memory Access) command indicates a mapping of a host memory address space to a storage device memory address space (e.g., a DRAM address) and controls the DMA transmission circuit to perform a data transfer. The host memory address indicated by a DMA command is determined from the address space indicated by the SGL or PRP, and the storage device memory address is allocated by the storage device. Each DMA command performs one data transfer operation between the host and the storage device.
Disclosure of Invention
In high-performance storage devices, the control unit may process hundreds or thousands of IO commands per second, each of which in turn requires initiating tens of DMA transfers. The write initiator circuit (see also FIG. 1C) thus requires extremely high processing power to match the performance of the storage device and is prone to becoming a performance bottleneck. If the function of the write initiator is implemented by software executed on the CPU, performance is limited by the CPU's processing speed, and the CPU becomes the bottleneck for IO command processing. Since improving CPU performance is costly, it is desirable to use dedicated hardware circuits to accomplish these tasks, thereby improving IO command processing performance. At the same time, offloading these tasks from the CPU reduces its load and frees it to process other tasks.
In view of this, embodiments of the present application provide a write initiator circuit and a storage device. The write initiator circuit is a hardware circuit independent of the CPU: it generates DMA commands that instruct data movement from host memory to the storage device memory, and, in response to completion of DMA command processing, generates storage commands that instruct data movement from the storage device memory to the storage device flash memory. Offloading the tasks of generating DMA commands and storage commands from the CPU improves the processing efficiency of IO commands, reduces the CPU's load, and frees the CPU to process other tasks.
According to a first aspect of the present application, there is provided a write initiator circuit comprising a DMA command generation circuit and a storage command generation circuit;
The DMA command generating circuit is coupled with the DMA transmission circuit, responds to receiving addressing information, generates a DMA command based on the addressing information and provides the DMA command to the DMA transmission circuit, wherein the DMA command is used for indicating the DMA transmission circuit to execute data movement from a host memory to a storage device cache according to the DMA command;
The storage command generating circuit is coupled with the DMA transmission circuit, and is used for generating a storage command and providing the storage command to the storage command processing unit in response to receiving a first message sent by the DMA transmission circuit, wherein the first message is used for indicating that the DMA command processing executed by the DMA transmission circuit is completed.
In an alternative embodiment of the present application, the DMA command generation circuit allocates a buffer of the storage device for the data to be moved indicated by the DMA command.
In an alternative embodiment of the application, the operation of the DMA command generation circuit is performed in parallel with, and independently of, the operation of the storage command generation circuit.
In an alternative embodiment of the present application, the DMA command generation circuit is further coupled to a cache unit for caching the addressing information, the addressing information comprising one or more of a PRP entry, an SGL entry, a host memory address determined from the PRP index or the SGL index.
In an alternative embodiment of the application, the DMA command generation circuit is further coupled with the storage command generation circuit, and the DMA command generation circuit provides the DMA command to the storage command generation circuit.
In an alternative embodiment of the present application, the write initiation circuit further includes a PRP entry or SGL entry retrieval circuit, the PRP entry or SGL entry retrieval circuit being coupled with the DMA command generation circuit;
The PRP entry or SGL entry retrieval circuit retrieves a PRP entry or SGL entry based on the addressing information in response to receiving the addressing information and provides the PRP entry or SGL entry to the DMA command generation circuit.
In an alternative embodiment of the present application, the write initiator circuit includes one or more circuit branches and a DMA command buffer, which are arranged in parallel, and each circuit branch includes a DMA command generating circuit and a flow control circuit;
The flow control circuit of each circuit branch is coupled with the DMA command generating circuit of the circuit branch where the flow control circuit is located, and is also coupled with the DMA command cache, and is used for controlling whether to provide the DMA command generated by the DMA command generating circuit of the circuit branch where the flow control circuit is located for the DMA command cache;
the DMA command buffer is also coupled to the DMA transfer circuit for buffering received DMA commands and providing buffered DMA commands to the DMA transfer circuit.
In an alternative embodiment of the present application, the flow control circuit of each circuit branch is further coupled to a quota manager, and the flow control circuit of each circuit branch controls whether to provide the DMA command generated by the DMA command generating circuit of the circuit branch where the flow control circuit of each circuit branch is located to the DMA command cache according to the flow control policy configured by the quota manager.
In an alternative embodiment of the present application, each of the circuit branches controls whether to provide the DMA command generated by the DMA command generating circuit of the circuit branch where it is located to the DMA command buffer according to the configured priority.
In an alternative embodiment of the application, the plurality of circuit branches arranged in parallel operate independently of each other.
In an alternative embodiment of the present application, if multiple NVMe write commands are processed at the same time, each NVMe write command indicates multiple pieces of addressing information, and each circuit branch receives the multiple pieces of addressing information indicated by one and the same NVMe write command.
In an alternative embodiment of the present application, if namespaces indicated by the plurality of processed NVMe write commands are the same, the same circuit branch receives addressing information indicated by the plurality of processed NVMe write commands, and if namespaces indicated by the plurality of processed NVMe write commands are different, different circuit branches receive addressing information indicated by the plurality of processed NVMe write commands.
In an alternative embodiment of the present application, the circuit branch receiving addressing information indicated by the plurality of processed NVMe write commands is determined according to a namespace indicated by the plurality of processed NVMe write commands.
In an alternative embodiment of the present application, the storage command generating circuit identifies an NVMe write command to which a DMA command corresponding to the first message belongs in response to the first message, and generates a second message in response to identifying that all processing of the DMA command corresponding to the NVMe write command is completed, where the second message indicates that all processing of all DMA commands corresponding to the NVMe write command is completed.
In an alternative embodiment of the present application, the write initiation circuit further includes a completion message generation circuit coupled with the storage command generation circuit;
The completion message generation circuit provides the second message to the host in response to receiving the second message generated by the storage command generation circuit.
In an alternative embodiment of the present application, the write initiation circuit further comprises an error processing circuit, the error processing circuit being coupled with the storage command generation circuit;
The storage command generating circuit is used for generating a third message and providing the third message to the error processing circuit in response to the identification of the DMA command processing error indicated by the first message, or is used for generating a fourth message and providing the fourth message to the error processing circuit, wherein the third message is used for indicating the DMA command processing error indicated by the first message and the fourth message is used for indicating the NVMe write command processing error indicated by the second message;
The error processing circuit receives the third message or the fourth message and processes the third message or the fourth message.
In an alternative embodiment of the present application, the error processing circuits include a plurality of error processing circuits, where the plurality of error processing circuits are in one-to-one correspondence with the plurality of circuit branches, and the plurality of error processing circuits respectively receive third messages for indicating processing errors of the NVMe write commands processed by the circuit branches corresponding to the plurality of error processing circuits.
In an alternative embodiment of the present application, each circuit branch further includes an index buffer, where the index buffer is coupled to the DMA command generating circuit of the circuit branch where the index buffer is located, and the index buffer buffers the received addressing information in response to receiving the addressing information.
According to a second aspect of embodiments of the present application, there is provided a storage device, including a write initiator circuit according to any of the embodiments of the first aspect of the present application, through which data indicated by an NVMe write command is moved from a host to the storage device.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments described in the present application; other drawings may be obtained from these drawings by a person having ordinary skill in the art.
FIG. 1A illustrates a block diagram of a solid state storage device;
FIG. 1B shows a block diagram of control components in a solid state storage device;
FIG. 1C shows a block diagram of a host command processing unit in a control unit in the related art;
FIG. 2 shows a schematic diagram of a PRP indicated by an NVMe write command;
FIG. 3 is a schematic diagram of a write initiator circuit according to an embodiment of the application;
FIG. 4 is a schematic diagram of a write initiator circuit according to another embodiment of the present application;
FIG. 5A is a schematic diagram showing a write initiator circuit according to another embodiment of the present application;
FIG. 5B is a schematic diagram showing a write initiator circuit according to another embodiment of the present application;
FIG. 6 is a schematic diagram showing a write initiator circuit according to another embodiment of the present application;
FIG. 7 is a schematic diagram showing a write initiator circuit according to another embodiment of the present application;
FIG. 8 is a schematic diagram showing the structure of a write initiator circuit according to another embodiment of the present application;
FIG. 9 is a schematic diagram of a write initiator circuit processing a plurality of NVMe write commands received consecutively according to an embodiment of the present application;
FIG. 10 is a schematic diagram showing a write initiator circuit according to still another embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
For ease of understanding, the PRP and SGL are briefly described below. PRP (Physical Region Page) and SGL (Scatter Gather List) are two ways of describing data to be transferred between a host and a storage device. An NVMe write command sent by the host to the storage device needs to indicate, in addition to the length of the data to be written, the location in host memory where that data is stored; this location may be indicated by a PRP or an SGL. The NVMe write command indicates one or more PRP entries or SGL descriptors. PRP entries indicate fixed-size memory spaces in host memory (e.g., the memory blocks shown in FIG. 2); that is, the memory spaces indicated by different PRP entries have the same size, e.g., 4KB or 16KB. As shown in FIG. 2, the NVMe write command indicates three PRP entries (PRP0, PRP1, and PRP2), where PRP0 indicates storage block A in host memory, PRP1 indicates storage block B, and PRP2 indicates storage block C; storage blocks A, B, and C are each 4KB in size (although they are not limited to 4KB). An SGL descriptor, by contrast, indicates a memory space of variable size in host memory; e.g., an NVMe write command indicates SGL descriptors (SGL1, SGL2, and SGL3), where SGL1 indicates a 4KB memory space, SGL2 a 2KB memory space, and SGL3 a 3KB memory space.
Whether a PRP entry or an SGL descriptor, each essentially describes one or more address spaces in host memory whose locations are arbitrary. The host carries a PRP- or SGL-related field in the NVMe write command, which may be the SGL or PRP itself (pointing to the host memory address space to be accessed), a pointer to an SGL or PRP linked list, or even a pointer to a pointer. In any of these forms, the storage device can always acquire the corresponding SGL or PRP from the NVMe write command. Based on this, in an alternative embodiment, if the NVMe write command indicates a PRP (e.g., a PRP entry itself or a PRP pointer), the write initiator circuit receives a PRP index and a DMA command is generated based on the PRP index; if the NVMe write command indicates an SGL (e.g., an SGL descriptor itself or an SGL pointer), the write initiator circuit receives an SGL index and a DMA command is generated based on the SGL index. It is to be appreciated that embodiments of the present application are not limited to the PRP or SGL modes of describing data transferred between a host and a storage device; the data to be transferred may also be described by other, functionally similar means.
For clarity purposes, the present application uses addressing information to indicate information obtained from NVMe write commands to address host memory. The addressing information may be, for example, a PRP index, PRP entry, SGL index, SGL descriptor, or the host memory address itself. Where the PRP entry or SGL descriptor carries the host memory address and the SGL index or PRP index represents the address of the SGL descriptor or PRP entry retrieved by, for example, the SGL/PRP unit and stored in the cache unit.
The following describes the structure of a write initiator circuit according to an embodiment of the present application, taking as an example a host issuing an NVMe write command to a storage device with the host memory addresses described by PRP. It is understood that in embodiments of the present application, PRP entries, DMA commands, and storage commands may be in a one-to-one, one-to-many, many-to-one, or many-to-many relationship. For simplicity, in the embodiments according to the present application, PRP entries, DMA commands, and storage commands are described as being in one-to-one correspondence.
FIG. 3 shows a schematic diagram of a write initiator circuit according to an embodiment of the application.
As shown in fig. 3, the write initiator circuit includes a DMA command generation circuit and a memory command generation circuit. By way of example, after the SGL/PRP unit (see also FIG. 1C) obtains, for example, a PRP entry from the NVMe write command, a PRP index indicating the PRP entry is provided to the write initiator circuit.
The DMA command generation circuit is coupled to the DMA transfer circuit. In response to receiving a PRP index (process (1-1) in fig. 3), the DMA command generation circuit generates a DMA command based on the PRP index and provides the generated DMA command to the DMA transfer circuit (process (1-2) in fig. 3).
The PRP index may take various forms. For example, the PRP index carries the host memory address indicated in the PRP entry; or the PRP index carries a pointer into a cache unit that stores the PRP entry, in which case the PRP entry is fetched from the cache unit through the PRP index to obtain the host memory address. Still alternatively, PRP indexes correspond one-to-one to PRP entries. Acquiring PRP entries from NVMe write commands and acquiring PRP indexes are prior art and are not described in detail here.
A DMA command is a command that directs the DMA to perform a data transfer; it indicates a mapping from a host memory address space to a storage device cache address space (e.g., a DRAM address). The host memory address indicated by the DMA command is determined from the address space indicated by the SGL or PRP, and the storage device cache address is allocated by the storage device. For an NVMe write command, the host memory address is the source address and the storage device cache address is the destination address. Each DMA command performs one data transfer operation between the host and the storage device.
Generating a DMA command requires a host memory address as the source address and a storage device cache address as the destination address. The host memory address is obtained from the received PRP index, and the storage device cache address is obtained by allocating space from the storage device cache. Optionally, the DMA command generation circuit allocates the storage device cache for the DMA command; for example, the DMA command generation circuit obtains an available storage device cache address through a memory allocator (not shown).
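As an illustration only (not the patented circuit itself), the pairing of a host memory address from a PRP entry with a freshly allocated storage device cache address can be sketched as follows; all names (`DmaCommand`, `CacheAllocator`, `make_dma_command`) are hypothetical, and a fixed 4 KB memory page size is assumed.

```python
# Hypothetical sketch: a DMA command pairs a host memory source address
# (taken from a PRP entry) with a storage-device cache destination address
# allocated by a memory allocator, as described in the text above.
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed NVMe memory page size


@dataclass
class DmaCommand:
    src_host_addr: int   # source: host memory address from the PRP entry
    dst_cache_addr: int  # destination: storage device cache (e.g., DRAM)
    length: int          # bytes to move


class CacheAllocator:
    """Hands out storage-device cache addresses, one block after another."""

    def __init__(self, base: int):
        self.next_free = base

    def alloc(self, size: int) -> int:
        addr = self.next_free
        self.next_free += size
        return addr


def make_dma_command(prp_entry: int, allocator: CacheAllocator) -> DmaCommand:
    # A PRP entry addresses a whole memory page, so the entry itself serves
    # as the host source address; the destination comes from the allocator.
    dst = allocator.alloc(PAGE_SIZE)
    return DmaCommand(src_host_addr=prp_entry, dst_cache_addr=dst,
                      length=PAGE_SIZE)
```

For an NVMe read command the roles of the two addresses would simply be reversed, since there the storage device cache is the source.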
In response to receiving a DMA command, the DMA transfer circuit moves data between the host memory and the storage device cache according to the DMA command, and after the data movement is completed, provides a message indicating that DMA command processing is complete to the storage command generation circuit (process (2) in fig. 3).
The storage command generation circuit is coupled to the DMA transfer circuit and the storage command processing unit. In response to receiving the DMA command processing completion message sent by the DMA transfer circuit, it generates a storage command and sends the storage command to the storage command processing unit (process (3) in fig. 3). By way of example, a storage command causes the storage command processing unit to move data from the storage device cache to the storage device flash memory (e.g., an NVM chip) in accordance with the storage command. The storage command processing unit and the storage command are both prior art and are not described here.
The DMA command generation circuit and the storage command generation circuit operate in parallel and independently of each other. For example, while the DMA command generation circuit is processing PRP_A, the storage command generation circuit may simultaneously be processing first message_A, where PRP_A corresponds to a first DMA command and first message_A corresponds to a second DMA command; that is, PRP_A and first message_A correspond to different DMA commands, and the first and second DMA commands may belong to the same or different NVMe write commands.
Optionally, the DMA command generation circuit is further coupled to the cache unit. The cache unit stores the PRP entry or the host memory address extracted from the PRP entry; accordingly, the PRP index carries a pointer into the cache unit. In response to receiving the PRP index, the DMA command generation circuit also acquires the host memory address corresponding to the PRP index from the cache unit.
Still alternatively, after generating a DMA command, the DMA command generation circuit also provides the DMA command, or an index associated with it, to the storage command generation circuit (process (1-3) in FIG. 3). The storage command generation circuit thereby knows that it should expect a message from the DMA transfer circuit indicating that DMA command processing is complete. Thus, after receiving such a message from the DMA module, the storage command generation circuit looks up the DMA command corresponding to the message to determine whether the DMA command was processed correctly. Optionally, the storage command generation circuit also obtains the number of DMA commands corresponding to the NVMe write command associated with the DMA command, so as to identify whether all DMA commands of that NVMe write command have been processed.
FIG. 4 illustrates a schematic diagram of another write initiator circuit according to an embodiment of the present application.
As shown in fig. 4, the write initiator circuit includes a PRP acquisition circuit, a DMA command generation circuit, and a storage command generation circuit.
In response to receiving a PRP index (process (1-1) in FIG. 4), the PRP acquisition circuit acquires the PRP based on the PRP index and supplies the acquired PRP to the DMA command generation circuit (process (1-1') in FIG. 4). The DMA command generation circuit receives the PRP supplied by the PRP acquisition circuit, generates a DMA command, and supplies the generated DMA command to the DMA transfer circuit (process (1-2) in fig. 4). For the DMA transfer circuit and the storage command generation circuit, reference may be made to the embodiment shown in fig. 3; details are not repeated here.
Optionally, the PRP acquisition circuit is coupled to a cache unit in which the PRP entry, or the host memory address extracted from the PRP entry, is stored. Accordingly, the PRP index carries a pointer indicating the cache unit. In response to receiving the PRP index, the PRP acquisition circuit acquires the PRP entry or host memory address corresponding to the PRP index from the cache unit.
FIG. 5A shows a schematic diagram of a write initiate circuit according to another embodiment of the present application.
As shown in fig. 5A, the write initiator circuit includes one or more circuit branches arranged in parallel, a DMA command cache, and a storage command generation circuit. Each circuit branch has the same circuit structure and comprises a DMA command generation circuit and a flow control circuit; the circuit branches operate independently of one another. If multiple NVMe write commands are processed at the same time, each indicating multiple PRP indexes, each circuit branch may process the PRP indexes corresponding to one or more NVMe write commands. All PRP indexes corresponding to one NVMe write command are sent to the same circuit branch, while PRP indexes belonging to different NVMe write commands may be provided to different circuit branches. If the namespaces (NameSpaces, abbreviated NS) indicated by the NVMe write commands being processed are the same, one circuit branch receives their corresponding PRP indexes; if the indicated namespaces differ, different circuit branches receive the corresponding PRP indexes. Optionally, the circuit branch that processes an NVMe write command may be determined according to the namespace the command indicates. A namespace is a collection of NVM (Non-Volatile Memory) that can be formatted into a number of logical blocks; one NVMe controller can support multiple namespaces identified by different namespace IDs (NSIDs). By way of example, the write initiator circuit in FIG. 5A includes four circuit branches: circuit branch 1, circuit branch 2, circuit branch 3, and circuit branch 4. It will be appreciated that the write initiator circuit may include other numbers of circuit branches.
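The branch-selection rule described above (all PRP indexes of one NVMe write command go to one branch, chosen by namespace) can be sketched as follows. This is a hypothetical illustration: the mapping of NSID to branch by modulo is an assumption made here for concreteness, and `select_branch`/`dispatch` are invented names, not part of the described circuit.

```python
# Hypothetical sketch of namespace-based routing: every PRP index of one
# NVMe write command is dispatched to the same circuit branch, and the
# branch is derived from the command's namespace ID (NSID).
NUM_BRANCHES = 4  # matches the four branches of FIG. 5A


def select_branch(nsid: int) -> int:
    """Map a namespace ID to a circuit branch (0..NUM_BRANCHES-1).
    Modulo is one possible mapping; the embodiment only requires that
    the same namespace always maps to the same branch."""
    return nsid % NUM_BRANCHES


def dispatch(prp_indexes, nsid: int, branches):
    """Send every PRP index of one write command to the same branch queue."""
    branch = branches[select_branch(nsid)]
    for idx in prp_indexes:
        branch.append(idx)
```

Because the mapping is deterministic per namespace, two write commands on the same namespace always share a branch, while commands on different namespaces can land on different branches and be processed in parallel.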
As shown in fig. 5A, the DMA command generation circuit is coupled to the flow control circuit. In response to receiving a PRP index (process (1-1) in fig. 5A), the DMA command generation circuit generates a DMA command based on the PRP index and provides the generated DMA command to the flow control circuit (process (1-4) in fig. 5A).
The flow control circuit of each circuit branch is coupled to the DMA command cache and controls whether the DMA command generated by the DMA command generation circuit of its circuit branch is provided to the DMA command cache (process (1-5) in fig. 5A).
According to the embodiment of the application, providing multiple circuit branches improves the throughput of DMA command generation, since the DMA command generation circuit of each branch generates DMA commands from PRP indexes in parallel. Each circuit branch works independently, so the workload of one branch does not affect the performance of the others. Further, each circuit branch has its own flow control circuit, so flow control can be applied to each branch separately. For example, the operation of one or more circuit branches may be suspended while other branches are allowed to operate. Each flow control circuit has a configurable bandwidth, for example limiting the number of DMA commands it provides to the DMA command cache per unit time. The flow control circuits operate independently, so the flow control strategy of one flow control circuit does not affect the operation of the others.
Optionally, the flow control circuits implement flow control according to configured priorities. For example, circuit branch 1, circuit branch 2, circuit branch 3, and circuit branch 4 are ordered from highest to lowest priority. When the flow control circuits in circuit branches 1, 2, and 3 all receive DMA commands, the flow control circuit in circuit branch 1 sends its received DMA commands to the DMA command cache first, so that the NVMe write commands processed by circuit branch 1 are handled with priority. While the flow control circuit in circuit branch 1 sends one or more DMA commands to the DMA command cache, the flow control circuits in circuit branches 2 and 3 may wait (e.g., suspend sending DMA commands to the DMA command cache).
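The strict-priority behavior just described, in which the highest-priority branch with a pending DMA command sends first, amounts to a simple fixed-priority arbiter. A minimal sketch, assuming branches are listed from highest to lowest priority (`arbitrate` is an invented name for illustration):

```python
# Hypothetical fixed-priority arbitration over the branch queues: the
# lowest-indexed (highest-priority) non-empty queue wins; lower-priority
# branches wait, as described for circuit branches 2 and 3 above.
def arbitrate(pending):
    """pending: per-branch DMA command queues, highest priority first.
    Returns the index of the branch allowed to send, or None if all
    queues are empty."""
    for i, queue in enumerate(pending):
        if queue:
            return i
    return None
```

A real controller would typically add aging or the credit scheme described below to keep low-priority branches from starving, but the selection rule itself is this simple.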
The DMA command cache is coupled to the DMA transfer circuit. The DMA command cache buffers DMA commands received from the circuit branches and sends them to the DMA transfer circuit (process (1-6) in fig. 5A). The DMA command cache may receive more than one DMA command from the multiple circuit branches, and the DMA transfer circuit takes time to process each DMA command, so the DMA command cache buffers unprocessed DMA commands when multiple DMA commands arrive.
For the DMA transfer circuit and the storage command generation circuit (processes (2) and (3) in fig. 5A), reference may be made to the embodiment shown in fig. 3; details are not repeated here.
In this embodiment, since multiple circuit branches may process multiple NVMe write commands in parallel, it is necessary to identify which NVMe write command a DMA command sent to the DMA transfer circuit belongs to, so that the downstream storage command generation circuit, upon receiving a DMA command processing completion message, can know which NVMe write command the DMA command corresponds to. Optionally, the DMA command generation circuit of this embodiment may further obtain corresponding CMD identification information based on the PRP index, the CMD identification information indicating the NVMe write command from which the PRP index comes.
Optionally, in response to identifying that all DMA commands corresponding to a certain NVMe write command have been processed, the storage command generation circuit may further generate a message indicating that the NVMe write command has been processed and send it to the host interface (process (4) in fig. 5A), thereby providing it to the host.
Optionally, the DMA command generation circuit in the write initiator circuit shown in FIG. 5A may also be coupled to a cache unit. Still alternatively, referring to fig. 4, the write initiator circuit may further include a PRP acquisition circuit. Optionally, the PRP acquisition circuit may be coupled to the cache unit.
FIG. 5B shows a schematic diagram of a write initiate circuit according to another embodiment of the present application.
As shown in fig. 5B, the flow control circuit of each circuit branch is coupled to a credit manager. The credit manager configures a flow control strategy to each flow control circuit, and each flow control circuit implements flow control according to the configured strategy.
For example, the credit manager provides credit to each circuit branch separately, and the flow control circuit consumes credit when sending DMA commands to the DMA command cache. The flow control circuit in each circuit branch controls the timing of sending DMA commands to the DMA command cache by checking the credit it owns. For example, in response to the credit it owns being greater than a specified value, the flow control circuit provides the received DMA command to the DMA command cache and reduces its credit accordingly (by, for example, the specified value). When its credit is less than the specified value, the flow control circuit does not provide the DMA command to the DMA command cache. As another example, if the flow control circuit in circuit branch 1 finds that circuit branch 1's current credit is insufficient, the DMA commands generated by circuit branch 1 are not sent to the DMA command cache; only after circuit branch 1's credit becomes sufficient are its DMA commands sent to the DMA command cache.
Optionally, when the credit manager provides credit to a flow control circuit, the credit owned by the corresponding circuit branch is increased accordingly; the size of the increase depends on the credit value provided by the credit manager. Optionally, the credit manager provides credit to the flow control circuits periodically or at specified occasions.
Alternatively, the credit the credit manager provides to different circuit branches may be the same or different. For example, the credit manager provides more credit to circuit branch 1 than to circuit branches 2-4. Since circuit branch 1 is allocated more credit, the DMA commands it generates have a greater chance of being provided to the DMA command cache than those of other branches; equivalently, the probability that circuit branch 1's DMA commands are blocked by its flow control circuit is relatively small. Because sending a DMA command to the DMA command cache consumes credit, when a circuit branch's remaining credit is insufficient, its DMA commands are not provided to the DMA command cache until the credit exceeds the specified value. This controls, according to each circuit branch's credit, which write commands are processed first, thereby implementing flow control over the write commands.
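The credit mechanism of the preceding paragraphs can be sketched as follows. This is a hedged illustration, not the circuit itself: the per-command cost of one credit, and the names `FlowControl`, `grant`, and `try_send`, are assumptions made here; the embodiment only requires that sending consumes credit and that an insufficient-credit branch waits.

```python
# Hypothetical per-branch credit-based flow control: the credit manager
# deposits credit via grant(); sending a DMA command to the shared DMA
# command cache consumes credit; with no credit, the command is held back.
class FlowControl:
    def __init__(self, cost: int = 1):
        self.credit = 0      # credit currently owned by this branch
        self.cost = cost     # credit consumed per DMA command (assumed: 1)

    def grant(self, amount: int):
        """Credit manager deposit (periodic or at specified occasions)."""
        self.credit += amount

    def try_send(self, dma_cmd, dma_cmd_cache: list) -> bool:
        """Forward the DMA command if enough credit is owned; otherwise
        hold it until the credit manager tops this branch up."""
        if self.credit >= self.cost:
            dma_cmd_cache.append(dma_cmd)
            self.credit -= self.cost
            return True
        return False
```

Granting branch 1 a larger `amount` than branches 2-4 reproduces the asymmetry described above: its commands are blocked less often, so its write commands tend to be processed first.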
According to the embodiment of the application, the credit manager independently configures for each circuit branch the bandwidth for generating DMA commands, or an upper bound on DMA command generation performance, so that each circuit branch has an independent flow control strategy, reducing the influence of load changes in one circuit branch on the performance of the other circuit branches.
Optionally, the circuit branches correspond to users, virtual machines, processes, or namespaces. By providing multiple circuit branches and allocating or binding different circuit branches to different users, virtual machines, processes, or namespaces, resource reservation in the storage device is achieved, and the performance impact of a bursty load from one user, virtual machine, process, or namespace instance on the other instances is avoided. For example, to allocate or bind different circuit branches to different users, virtual machines, processes, or namespaces, the user, virtual machine, process, or namespace associated with an NVMe write command (e.g., the user, virtual machine, or process that issued the command, or the namespace the command indicates) is identified and used as the basis for providing all PRP indexes of that NVMe write command to the specified circuit branch or branches to generate DMA commands.
FIG. 6 is a schematic diagram showing a write initiator circuit according to still another embodiment of the present application.
As shown in fig. 6, the write initiation circuit further includes a completion message generation circuit on the basis of the embodiment shown in fig. 5B. The completion message generation circuit is coupled to the storage command generation circuit.
In response to identifying that all DMA commands corresponding to a certain NVMe write command have been processed, the storage command generation circuit generates a message indicating that the NVMe write command has been processed and sends it to the completion message generation circuit (process (4-1) in fig. 6). In response to receiving that message, the completion message generation circuit transmits the NVMe write command processing completed message to the host interface (process (4-2) in fig. 6).
FIG. 7 is a schematic diagram showing a write initiator circuit according to still another embodiment of the present application.
As shown in fig. 7, the write initiation circuit further includes an error handling circuit on the basis of the embodiment shown in fig. 6. The error processing circuit is coupled to the memory command generation circuit.
In response to receiving a DMA command processing completion message, the storage command generation circuit may recognize a DMA command processing error, generate a DMA command processing error message, and supply it to the error processing circuit (process (3-1) in fig. 7). Alternatively, in response to receiving the DMA command processing completion message, the storage command generation circuit identifies the NVMe write command to which the corresponding DMA command belongs, recognizes that the NVMe write command was processed in error, generates an NVMe write command processing error message, and supplies it to the error processing circuit (process (4-3) in fig. 7). In response to receiving a DMA command processing error message or an NVMe write command processing error message, the error processing circuit processes that message. Alternatively, the error processing circuit may provide the DMA command processing error message or NVMe write command processing error message to the CPU of the control unit (as shown in fig. 1B), or provide it to the host interface to inform the host of the DMA command or NVMe write command processing error. Optionally, the error processing circuit may also handle the NVMe write command processing error message in other ways according to the prior art, not described here.
In an alternative embodiment, there may be multiple error processing circuits, corresponding one-to-one to the multiple circuit branches. Each error processing circuit receives the messages indicating NVMe write command processing errors for the commands processed by its corresponding circuit branch. Referring to circuit branches 1-4 shown in fig. 5A, the write initiator circuit may include four error processing circuits, error processing circuits 1-4, corresponding to circuit branches 1-4 respectively. The storage command generation circuit supplies the DMA command processing error messages or NVMe write command processing error messages for commands handled by circuit branch 1 to error processing circuit 1, those for circuit branch 2 to error processing circuit 2, those for circuit branch 3 to error processing circuit 3, and those for circuit branch 4 to error processing circuit 4.
By providing an error processing circuit for each circuit branch, the DMA command processing error messages or NVMe write command processing error messages from each circuit branch can be acquired separately, so that flow control can be applied separately to the handling of NVMe write command processing error messages. For example, error handling for one circuit branch may be prioritized while error handling for another circuit branch is deferred.
FIG. 8 is a schematic diagram showing a write initiator circuit according to still another embodiment of the present application.
As shown in fig. 8, on the basis of the embodiment shown in fig. 7, each circuit branch of the write initiator circuit includes an index cache (caches 1-4 in fig. 8). The index cache is coupled to the DMA command generation circuit of its circuit branch and caches each PRP index it receives. Caching the received PRP indexes allows the rate at which the write initiator circuit generates DMA commands to be matched to the rate at which PRP indexes arrive.
Taking circuit branch 1 shown in fig. 8 as an example, the processing flow of the write initiator circuit according to the embodiment of the present application is as follows:
process (1-1) provides for writing one or more PRP indices to cache 1.
Process (1-b) buffer 1 provides the buffered one or more PRP indices to the DMA command generation circuit.
The process (1-4) is that the DMA command generation circuit generates a DMA command based on the received PRP index and provides the generated DMA command to the flow control circuit. The DMA command is used to instruct the transfer of data (e.g., 4 KB) in one memory block from the host to the memory device (e.g., DRAM).
The process (1-5) is that the flow control circuit controls whether to provide the received DMA command to the DMA command buffer according to the configured flow control strategy.
The process (1-6) is that the DMA command cache provides the cached DMA command to the DMA transfer circuit.
In process (2), in response to receiving a DMA command, the DMA transfer circuit processes it (e.g., moves the 4 KB of data indicated by the DMA command from host memory to the storage device's DRAM); after the DMA command is processed, it generates a DMA command processing completed message and sends it to the storage command generation circuit.
In process (3), in response to receiving a DMA command processing completed message, for example a message that DMA command 1 has been processed, meaning the data (e.g., 4 KB) indicated by DMA command 1 has been moved to the DRAM, the storage command generation circuit generates a storage command (corresponding to the 4 KB of data) and sends the generated storage command to the storage command processing unit.
In process (3-1), when the storage command generation circuit recognizes from a DMA command processing completed message that the corresponding DMA command was processed in error, for example DMA command 1, it generates a DMA command processing error message and supplies it to the error processing circuit.
In process (4-1), in response to identifying that all DMA commands corresponding to a certain NVMe write command have been processed, the storage command generation circuit generates a message indicating that the NVMe write command has been processed and provides it to the completion message generation circuit.
In process (4-2), in response to receiving the NVMe write command processing completed message, the completion message generation circuit transmits it to the host interface.
In process (4-3), if the storage command generation circuit recognizes that a certain NVMe write command was processed in error, for example NVMe write command 1, it generates an NVMe write command processing error message and provides it to the error processing circuit.
In process (5) (not shown), in response to receiving a DMA command processing error message or an NVMe write command processing error message, the error processing circuit processes it, for example by providing it to the CPU of the control unit, or by informing the host of the error through the host interface in the NVMe write command processing completion message.
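The completion tracking implied by steps (3) through (4-2), where the storage command generation circuit must recognize when every DMA command of one NVMe write command has finished, can be sketched as a per-command outstanding counter. `CompletionTracker` and its methods are hypothetical names for illustration; the circuit may track this state differently.

```python
# Hypothetical sketch of per-write-command completion tracking: each NVMe
# write command registers how many DMA commands it was split into, and the
# NVMe-level completion fires only when the last DMA completion arrives.
class CompletionTracker:
    def __init__(self):
        self.remaining = {}  # NVMe command id -> DMA commands outstanding

    def register(self, cmd_id: str, n_dma: int):
        """Record that write command cmd_id was split into n_dma DMA commands."""
        self.remaining[cmd_id] = n_dma

    def on_dma_done(self, cmd_id: str) -> bool:
        """Handle one DMA command processing completed message.
        Returns True when the whole NVMe write command is complete, i.e.
        when a completion message should go to the host interface."""
        self.remaining[cmd_id] -= 1
        if self.remaining[cmd_id] == 0:
            del self.remaining[cmd_id]
            return True
        return False
```

This also shows why step (1-3) of fig. 3 (forwarding the DMA command or its index to the storage command generation circuit) is useful: the tracker needs to know the expected count before completions start arriving.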
According to the write initiator circuit provided by the embodiment of the application, the parts of the write initiator circuit each work independently and in parallel, so the write initiator circuit can simultaneously process multiple NVMe write commands and the multiple PRP indexes corresponding to them.
Fig. 9 shows how the write initiator circuit of the embodiment of the present application processes a plurality of consecutively received PRP indexes. In fig. 9, T0 to T5 denote successive time periods, and the contents below each period denote the operations performed by the respective circuits in that period.
In the T0 period, the PRP acquiring circuit acquires the PRP entry PRP0 corresponding to the PRP0 index in response to receiving the PRP0 index, and sends the PRP0 to the DMA command generating circuit.
In the period of T1, the PRP acquisition circuit acquires the PRP entry PRP1 corresponding to the PRP1 index in response to receiving the PRP1 index. Meanwhile, the DMA command generating circuit generates a corresponding DMA command 0 based on PRP0, and supplies the DMA command 0 to the DMA transfer circuit.
In the period of T2, the PRP acquisition circuit acquires the PRP entry PRP2 corresponding to the PRP2 index in response to receiving the PRP2 index. Meanwhile, the DMA command generating circuit generates a corresponding DMA command 1 based on PRP1, and the DMA transfer circuit executes DMA command 0.
In the period T3, the DMA command generation circuit generates a corresponding DMA command 2 based on PRP2, and at the same time, the DMA transfer circuit executes DMA command 1, and the memory command generation circuit generates memory command 0 in response to receiving a DMA command 0 process completion message.
In the T4 period, the DMA transfer circuit executes the DMA command 2, and at the same time, the storage command generation circuit generates the storage command 1 in response to receiving the DMA command 1 processing completion message.
In the T5 period, the storage command generation circuit generates the storage command 2 in response to receiving the DMA command 2 processing completion message.
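The staggered T0-T5 timeline above is a classic pipeline: acquire, generate, transfer, and store each advance by one period per PRP index, overlapping across indexes. A toy model, assuming for simplicity that every stage takes exactly one period (the text below notes the real stages need not be equal), with invented names:

```python
# Toy pipeline model of the fig. 9 timeline: four one-period stages
# (PRP acquire, DMA command generate, DMA transfer, storage command).
# Stage k of PRP index i happens in period i + k.
def schedule(n_prps: int):
    """Return sorted (period, stage, item) events for n_prps PRP indexes."""
    events = []
    for i in range(n_prps):
        events.append((i,     'acquire',  f'PRP{i}'))
        events.append((i + 1, 'generate', f'DMA{i}'))
        events.append((i + 2, 'transfer', f'DMA{i}'))
        events.append((i + 3, 'store',    f'STORE{i}'))
    return sorted(events)
```

Reading off the events for period T2 reproduces fig. 9: PRP2 is acquired while DMA command 1 is generated and DMA command 0 is executed, three circuits busy at once.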
PRP index 0 through PRP index 2 shown in fig. 9 may come from the same or from different NVMe write commands. The DMA command generation circuit generates a DMA command from each PRP index without being affected by other factors such as the order of the PRP indexes within the NVMe write command. Even if successively received PRP indexes belong to different NVMe write commands, or belong to the same NVMe write command but arrive in a different order from that described by the command, or PRP indexes of other NVMe write commands arrive before all PRP indexes of one NVMe write command have been received, the DMA command generation circuit still processes them and generates DMA commands according to the method provided by the embodiment of the present application.
The timing at which the write initiator circuit receives PRP indexes is arbitrary; the write initiator circuit controls neither the timing at which a PRP index is received nor the order or temporal relationship in which multiple PRP indexes are received, and that order or temporal relationship does not affect its processing. The time each of the PRP acquisition circuit, the DMA command generation circuit, and the storage command generation circuit takes to perform its own processing is not limited to the single period shown in fig. 9, and these times may be the same or different.
The DMA transfer circuit also processes multiple DMA commands concurrently, which is prior art and not described in detail. The write initiator circuit according to embodiments of the present application operates effectively even if, for example, the processing time of a DMA command changes because the communication link between the host and the storage device enters hibernation or for other reasons. A change in the processing delay of one DMA command does not affect the processing of other DMA commands. The storage command generation circuit of the embodiment of the application responds to the processing completion message of each DMA command without being affected by the processing completion messages of other DMA commands.
The embodiments shown in Figs. 3-8 illustrate the write initiation circuit using the PRP index as an example; the processing the write initiation circuit applies to a PRP index applies equally to an SGL index. For an SGL index, the DMA command generation circuit, in response to receiving the SGL index, generates a DMA command based on it and provides the generated DMA command to the DMA transfer circuit. The DMA transfer circuit, in response to the received DMA command, moves data between the host memory and the storage device cache according to that command and, after the data transfer completes, provides the storage command generation circuit with a message indicating that processing of the DMA command is complete. The storage command generation circuit, in response to receiving the DMA command processing-completion message sent by the DMA transfer circuit, generates a storage command and sends it to the storage command processing unit. That is, the write initiation circuit of the embodiments of the present application processes a received PRP index and a received SGL index in similar ways, which is not repeated here.
Similar to the PRP index, the SGL index also takes various forms: for example, the SGL index carries the host memory address indicated in an SGL descriptor, or the SGL index carries a pointer to a cache unit in which the SGL descriptor is stored, and the SGL descriptor is fetched from the cache unit through the SGL index so as to obtain the host memory address. Alternatively, SGL indexes correspond one-to-one with SGL descriptors. Obtaining SGL descriptors from NVMe write commands and obtaining SGL indexes belong to the prior art and are not described in detail here.
Referring to the write initiation circuit shown in Fig. 4, the write initiation circuit further includes an SGL acquisition circuit, as shown in Fig. 10. The SGL acquisition circuit, in response to receiving an SGL index, acquires the SGL descriptor based on the SGL index and supplies the acquired SGL descriptor to the DMA command generation circuit.
Optionally, the SGL acquisition circuit may also be coupled to a cache unit. The cache unit stores the SGL descriptor, or the host memory address extracted from the SGL descriptor; accordingly, the SGL index carries a pointer indicating the cache unit. In response to receiving the SGL index, the SGL acquisition circuit acquires the SGL descriptor or host memory address corresponding to the SGL index from the cache unit.
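The two SGL index forms described above can be sketched as follows. This is an illustrative model only, under the assumption that a cache slot holds a full SGL descriptor; the dictionary-based "cache unit" and all field names (`ptr`, `descriptor`, `host_addr`) are hypothetical.

```python
# A cache unit modeled as slot -> SGL descriptor. Each descriptor carries
# the host memory address and transfer length it describes.
cache_unit = {
    0: {"host_addr": 0x4000, "length": 8192},
    1: {"host_addr": 0xA000, "length": 4096},
}


def acquire_sgl(sgl_index: dict) -> dict:
    """Return the SGL descriptor for an index, covering both forms:
    the index either carries the descriptor inline, or carries a
    pointer into the cache unit where the descriptor is stored."""
    if "descriptor" in sgl_index:
        return sgl_index["descriptor"]          # inline form
    return cache_unit[sgl_index["ptr"]]         # pointer-to-cache form


inline_desc = acquire_sgl({"descriptor": {"host_addr": 0x2000, "length": 512}})
cached_desc = acquire_sgl({"ptr": 1})
```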
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application. It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A write initiation circuit, characterized in that it comprises a DMA command generation circuit and a storage command generation circuit;
the DMA command generation circuit is coupled to a DMA transfer circuit; in response to receiving addressing information, the DMA command generation circuit generates a DMA command based on the addressing information and provides the DMA command to the DMA transfer circuit; the DMA command instructs the DMA transfer circuit to perform, according to the DMA command, data movement from the host memory to the storage device cache;
the storage command generation circuit is coupled to the DMA transfer circuit; in response to receiving a first message sent by the DMA transfer circuit, the storage command generation circuit generates a storage command and provides the storage command to a storage command processing unit; wherein the first message indicates that the DMA command processing performed by the DMA transfer circuit is completed.

2. The write initiation circuit according to claim 1, characterized in that the DMA command generation circuit is further coupled to a cache unit, the cache unit being used to cache the addressing information, and the addressing information comprises one or more of the following: a PRP entry, an SGL entry, and a host memory address determined according to the PRP index or the SGL index.

3. The write initiation circuit according to claim 1 or 2, characterized in that the write initiation circuit further comprises a PRP entry or SGL entry acquisition circuit, the PRP entry or SGL entry acquisition circuit being coupled to the DMA command generation circuit;
in response to receiving addressing information, the PRP entry or SGL entry acquisition circuit acquires a PRP entry or SGL entry based on the addressing information and provides the PRP entry or SGL entry to the DMA command generation circuit.

4. The write initiation circuit according to any one of claims 1 to 3, characterized in that the write initiation circuit comprises one or more circuit branches arranged in parallel and a DMA command buffer, each of the circuit branches comprising a DMA command generation circuit and a flow control circuit;
the flow control circuit of each circuit branch is coupled to the DMA command generation circuit of its own circuit branch and is also coupled to the DMA command buffer, and is used to control whether the DMA command generated by the DMA command generation circuit of its own circuit branch is provided to the DMA command buffer;
the DMA command buffer is also coupled to the DMA transfer circuit, and is used to buffer received DMA commands and provide the buffered DMA commands to the DMA transfer circuit.

5. The write initiation circuit according to claim 4, characterized in that the flow control circuit of each circuit branch is further coupled to a quota manager, and the flow control circuit of each circuit branch controls, according to the flow control strategy configured by the quota manager, whether the DMA command generated by the DMA command generation circuit of its own circuit branch is provided to the DMA command buffer.

6. The write initiation circuit according to claim 4 or 5, characterized in that, if multiple NVMe write commands are being processed at the same time and each of the NVMe write commands indicates multiple pieces of addressing information, then, for each circuit branch, the circuit branch receives the multiple pieces of addressing information indicated by the same NVMe write command.

7. The write initiation circuit according to claim 6, characterized in that, if the namespaces indicated by the multiple processed NVMe write commands are the same, the addressing information indicated by the multiple processed NVMe write commands is received by the same circuit branch; if the namespaces indicated by the multiple processed NVMe write commands are different, the addressing information indicated by the multiple processed NVMe write commands is received by different circuit branches.

8. The write initiation circuit according to claim 7, characterized in that the circuit branch that receives the addressing information indicated by the multiple processed NVMe write commands is determined according to the namespaces indicated by the multiple processed NVMe write commands.

9. The write initiation circuit according to any one of claims 1 to 8, characterized in that:
the storage command generation circuit, in response to the first message, identifies the NVMe write command to which the DMA command corresponding to the first message belongs, and, in response to identifying that all DMA commands corresponding to the NVMe write command have been processed, generates a second message, the second message indicating that all DMA commands corresponding to the NVMe write command have been processed.

10. The write initiation circuit according to claim 9, characterized in that the write initiation circuit further comprises an error handling circuit, the error handling circuit being coupled to the storage command generation circuit;
in response to identifying a DMA command processing error indicated by the first message, the storage command generation circuit generates a third message and provides the third message to the error handling circuit; or, upon identifying an NVMe write command processing error indicated by the second message, the storage command generation circuit generates a fourth message and provides the fourth message to the error handling circuit; wherein the third message indicates the DMA command processing error indicated by the first message, and the fourth message indicates the NVMe write command processing error indicated by the second message;
the error handling circuit receives the third message or the fourth message and processes the third message or the fourth message.
CN202310788448.3A | 2023-06-29 | 2023-06-29 | NVMe write IO initiator circuit of NVMe controller | Pending | CN119229912A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202310788448.3A (CN119229912A (en)) | 2023-06-29 | 2023-06-29 | NVMe write IO initiator circuit of NVMe controller

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202310788448.3A (CN119229912A (en)) | 2023-06-29 | 2023-06-29 | NVMe write IO initiator circuit of NVMe controller

Publications (1)

Publication Number | Publication Date
CN119229912A (en) | 2024-12-31

Family

ID=94039145

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202310788448.3A (Pending, CN119229912A (en)) | NVMe write IO initiator circuit of NVMe controller | 2023-06-29 | 2023-06-29

Country Status (1)

Country | Link
CN (1) | CN119229912A (en)

Similar Documents

Publication | Title
CN113377283B (en) | Memory system with partitioned namespaces and method of operation thereof
CN107885456B (en) | Reducing conflicts for IO command access to NVM
US8924659B2 (en) | Performance improvement in flash memory accesses
US11762590B2 (en) | Memory system and data processing system including multi-core controller for classified commands
CN113468083B (en) | Dual-port NVMe controller and control method
CN109164976B (en) | Optimizing storage device performance using write caching
CN110554833B (en) | Parallel processing IO commands in a memory device
WO2019062202A1 (en) | Method, hard disk, and storage medium for executing hard disk operation instruction
CN113032293A (en) | Cache manager and control component
US10671307B2 (en) | Storage system and operating method thereof
WO2018024214A1 (en) | IO flow adjustment method and device
CN108958642A (en) | Storage system and operating method thereof
CN111258932A (en) | Method and storage controller for accelerating UFS protocol processing
CN114253461A (en) | Mixed channel storage device
KR20230034194A (en) | Dual mode storage device
CN213338708U (en) | Control unit and storage device
CN113867615A (en) | Intelligent cache allocation method and control component
CN113485643B (en) | Method for data access and controller for data writing
CN112148626A (en) | Storage method and storage device for compressed data
CN107885667B (en) | Method and apparatus for reducing read command processing delay
CN113805813B (en) | Method and apparatus for reducing read command processing delay
CN114253462A (en) | Method for providing mixed channel memory device
CN113031849A (en) | Direct memory access unit and control unit
CN119512984A (en) | Method for efficiently utilizing a cache unit for caching PRP in a storage device
CN119718727A (en) | Cache unit management method and storage device

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
