CN111711566A - Receiver out-of-order rearrangement method in multi-path routing scenario

Receiver out-of-order rearrangement method in multi-path routing scenario

Info

Publication number
CN111711566A
Authority
CN
China
Prior art keywords
bitmap
value
msn
packet
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010629313.9A
Other languages
Chinese (zh)
Other versions
CN111711566B (en)
Inventor
顾华玺
刁兴龙
相希睿
余晓杉
朱李晶
徐晓琪
马天阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202010629313.9A
Publication of CN111711566A
Application granted
Publication of CN111711566B
Active
Anticipated expiration

Abstract

The invention discloses a receiver-side out-of-order rearrangement method for multi-path routing scenarios in computing networks, which mainly addresses the large amount of redundant retransmission caused by the existing go-back-N mechanism. The scheme is as follows: a receive bitmap linked list and a send bitmap linked list are constructed in the network card; when a request packet arrives at the receiver's network card, the receiver determines the sequence number of the response packet, compares the receive queue's expected receive sequence number with the request packet's sequence number, and caches, executes, or discards the request packet according to the result; when a response packet arrives at the sender's network card, the sender determines the sequence number of the request packet and, according to the response packet type, the MSN message number, and the bitmap entry of the send queue, either discards the response packet or edits the work queue element. The invention reduces the occupancy of network card storage resources, reduces redundant retransmission by the sender, improves effective throughput, and can be used by high-performance computing networks to process receiver-side data in multi-path routing scenarios.

Description

Translated from Chinese
Receiver out-of-order rearrangement method in multi-path routing scenario

Technical field

The invention belongs to the technical field of computing networks and in particular relates to a receiver-side out-of-order rearrangement method, which can be used by high-performance computing networks to process receiver-side data in multi-path routing scenarios.

Background

Although high-performance computing (HPC) networks go to great lengths to avoid packet loss, in multi-path routing scenarios a packet sent earlier may still reach the receiver later than a packet sent after it, which causes packet reordering.

Current HPC networks handle packet reordering with the go-back-N mechanism. As soon as the receiver determines that an arriving packet is out of order, that is, the packet's sequence number does not match the receiver's expected sequence number, it discards the packet and returns a NACK packet to the sender. On receiving the NACK, the sender retransmits not only the data for the sequence number indicated by the NACK but also all data after that sequence number, which causes a large amount of redundant retransmission. Go-back-N thus reacts too aggressively to out-of-order packets and incurs heavy network overhead.
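To make the cost described above concrete, the following minimal Python sketch counts how many packets a go-back-N sender would resend after a single reordering event. The function name and the simplified model (no real loss, one NACK, all packets already sent) are assumptions made for illustration and are not taken from the patent.

```python
def go_back_n_retransmissions(arrival_order, total_sent):
    """Count packets resent by a go-back-N sender after the first reordering.

    arrival_order: sequence numbers in the order they reach the receiver.
    total_sent: number of packets already sent (sequence numbers 0..total_sent-1).
    Simplifying assumption: no packet is lost, packets are only reordered.
    """
    expected = 0
    for psn in arrival_order:
        if psn == expected:
            expected += 1
        else:
            # The receiver drops the packet and NACKs `expected`; the sender
            # then resends `expected` and every packet after it.
            return total_sent - expected
    return 0


# Packets 2 and 3 swap places on the way: six packets are resent even though
# every one of them would have reached the receiver.
resent = go_back_n_retransmissions([0, 1, 3, 2, 4, 5, 6, 7], total_sent=8)
```

A receiver that simply recorded packet 3 as "arrived" and waited for packet 2 would need no retransmission at all in this scenario.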

Yuanwei Lu et al., in the article "Multi-Path Transport for RDMA in Datacenters" (15th USENIX Symposium on Networked Systems Design and Implementation (NSDI '18), pp. 362), propose an out-of-order-aware multipath selection algorithm based on MP-RDMA. It first uses a bitmap to track out-of-order packets and record their arrival status, then caches the packets in additional storage space allocated at the host, and finally actively removes slow paths at the receiving end so that only fast paths with similar delay are selected, which limits the degree of reordering. The drawback of this method is that it requires additional storage for out-of-order packets, and operations such as data lookup add latency, which hinders fast responses from the network card and makes it difficult to meet the low-latency requirements of high-performance computing networks.

Jianjun Hu, in the article "Study on an Improved GO-BACK-N ARQ Policy" (Computer Applications and Software, vol. 28, no. 7, pp. 230-232, 2011.6), proposes an improved go-back-N ARQ method. It sets up a buffer at the sender to enable fast retransmission of erroneous or lost packets and introduces a timeout mechanism at the receiver to avoid deadlock. Each time the receiver receives an erroneous packet, it starts the timeout and discards that packet and several subsequent packets; each time it receives an out-of-order packet, it discards that packet. The drawback of this method is that the scheme suits only ordinary computing networks, and because it is an improvement on top of the original communication protocol, its gains in throughput and latency are limited; it performs poorly in high-performance computing networks that require high throughput and low latency.

Summary of the invention

The purpose of the present invention is to address the above shortcomings of the prior art by proposing a receiver-side out-of-order rearrangement method for multi-path routing scenarios, so as to reduce the occupancy of network card storage resources, reduce redundant retransmission by the sender, and improve the effective throughput of the computing network.

To achieve this purpose, the invention is implemented in the following steps:

(1) Construct a receive bitmap linked list in the network card of each network node, and allocate in it one bitmap entry for each active receive queue to record the arrival status of that queue's request packets;

(2) Construct a send bitmap linked list in the network card of each network node, and allocate in it one bitmap entry for each active send queue to record the arrival status of that queue's response packets;

(3) Depending on the position of the network card in the network, perform different operations for each sender and each responder (receiver) on the network card:

For a receiver, go to (4);

For a sender, go to (5);

(4) When a request packet arrives at the receiver's network card, the receiver takes the cumulative number of request packets sent as the request sequence number RPSN of the request packet, compares the receive queue's expected receive sequence number eRPSN with the RPSN, caches, executes, or discards the request packet according to the result, and determines the response sequence number APSN of the response packet returned to the sender;

(5) When a response packet arrives at the sender's network card, the sender takes the cumulative number of response packets executed as the response sequence number APSN and, based on the response packet type, the MSN message number of the response packet, and the bitmap entry of the send queue, does one of two things with the response packet: discards it, or edits the work queue element WQE according to the message number that the sender can complete.

Compared with the prior art, the present invention has the following advantages:

First, it occupies little network card storage.

Because the invention uses bitmap linked lists, it records only the arrival status of out-of-order packets while still executing their request/response content normally, which saves network card storage overhead and reduces the network card's resource occupancy;

Second, it reduces redundant retransmission by the sender and improves effective throughput.

After the receiver receives a request packet of a message, it determines whether the packet arrived out of order. An out-of-order request packet is not discarded outright; instead, a bitmap records its arrival status and usable out-of-order packets are cached. This avoids a large amount of redundant retransmission by the sender and thereby improves the effective throughput of the network.

Brief description of the drawings

Fig. 1 is a schematic diagram of the overall implementation flow of the invention;

Fig. 2 is a schematic diagram of the packet sequence numbering of request packets and response packets in the invention;

Fig. 3 is a schematic diagram of the receiver-side out-of-order rearrangement sub-flow in the invention;

Fig. 4 is a schematic diagram of the sender-side out-of-order rearrangement sub-flow in the invention.

Detailed description

Embodiments of the invention are described in further detail below with reference to the accompanying drawings.

Referring to Fig. 1, this example is implemented in the following steps:

Step 1. Construct the receive bitmap linked list.

Construct a receive bitmap linked list in the network card of each network node, and allocate in it one bitmap entry for each active receive queue to record the arrival status of that queue's request packets, where:

The network card of each network node has multiple receivers, and each receiver maintains its own receive queue to record information about the request packets it receives;

Each receive bitmap linked list contains several bitmap entries;

Each receive queue maintains the variables next_APSN and MSN:

The next_APSN variable is used to fill the sequence number field of the response packet;

When a response packet is sent from the receiver, the receiver assigns the MSN variable to the MSN field in the response packet header, so that the receiver's MSN information is carried to the sender in the response packet; the value of the MSN variable also serves as the MSN message number of the response packet;

Each bitmap entry consists of three parts: the receive queue identifier, the receive bitmap gap array, and the receive bitmap header. The receive queue identifier marks the bitmap entry so that later operations can look it up in the receive bitmap linked list;

The receive bitmap gap array contains N bitmap gap elements, N≥1, and records the arrival status of packets reaching the receiver. Each bitmap gap element occupies 2 bits: the value "00" means the gap is empty and the corresponding packet has not arrived; "01" means the corresponding packet has arrived; "10" means a tail packet has arrived. All bitmap gap elements are initialized to "00". A tail packet is a Write Last request packet or a Read Last response packet;

The receive bitmap header always points to the bitmap gap element that corresponds to the receive queue's expected receive sequence number eRPSN.
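The receive bitmap entry described in Step 1 can be pictured with a short Python sketch. It is only an illustration: the class and method names are hypothetical, the gap array is modelled as a circular buffer (the patent does not say how the header wraps), and `n_slots=8` is an arbitrary size.

```python
from dataclasses import dataclass, field

# 2-bit gap element states as defined in the text.
EMPTY, ARRIVED, TAIL_ARRIVED = 0b00, 0b01, 0b10


@dataclass
class ReceiveBitmapEntry:
    queue_id: int                 # receive queue identifier
    n_slots: int = 8              # N >= 1 (illustrative value)
    head: int = 0                 # index of the gap element that matches eRPSN
    slots: list = field(default_factory=list)

    def __post_init__(self):
        if not self.slots:
            self.slots = [EMPTY] * self.n_slots   # all elements start as "00"

    def slot_index(self, distance):
        """Index of the element `distance` = RPSN - eRPSN ahead of the header."""
        return (self.head + distance) % self.n_slots

    def mark(self, distance, state):
        """Record the arrival state of a packet that is `distance` ahead."""
        self.slots[self.slot_index(distance)] = state

    def advance_head(self):
        """Reset the current element and point the header at the next one."""
        self.slots[self.head] = EMPTY
        self.head = (self.head + 1) % self.n_slots


entry = ReceiveBitmapEntry(queue_id=1, n_slots=4)
entry.mark(2, ARRIVED)      # the packet with RPSN = eRPSN + 2 arrived early
```

Storing 2 bits per packet instead of the packet itself is what keeps the per-queue state small.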

Step 2. Construct the send bitmap linked list.

Construct a send bitmap linked list in the network card of each network node, and allocate in it one bitmap entry for each active send queue to record the arrival status of that queue's response packets, where:

The network card of each network node has multiple senders. Each sender maintains its own send queue to record information about the request packets it sends and the response packets it receives;

Each send queue maintains the variables next_RPSN, MSN_max, MSN_min, and SSN_first_read, together with a Read operation information table:

The next_RPSN variable is used to fill the sequence number field in the request packet header; regardless of how long the requested data of the current Read request packet is, the RPSN of any request packet that follows the Read request packet is incremented by only 1;

The MSN_max variable is the message sequence number that the receiver has cumulatively acknowledged, that is, the sequence number up to which the receiver has released storage; the MSN of an ACK-type packet never exceeds the value of MSN_max;

The MSN_min variable is the largest message sequence number that the sender can currently complete; the MSN of a Response-type response packet is never less than the value of MSN_min;

The SSN_first_read variable is the message sequence number of the first Read message among the messages containing Read request packets that the sender has sent but whose storage has not yet been released;

The Read operation information table allocates one entry for the request packet of each Read operation. Each entry contains four elements: the send sequence number, the starting response sequence number start_APSN of the response packets, the number of response packets, and the virtual address;

Each sender-side bitmap entry consists of four parts: the send queue identifier, the send bitmap gap array, the eR_APSN variable, and the send bitmap header:

The send queue identifier marks the bitmap entry so that later operations can look it up in the send bitmap linked list;

The send bitmap gap array contains N bitmap gap elements, N≥1, and records the arrival status of packets reaching the sender. Each bitmap gap element occupies 2 bits: the value "00" means the gap is empty and the corresponding packet has not arrived; "01" means the corresponding packet has arrived; "10" means a tail packet has arrived. All bitmap gap elements are initialized to "00". A tail packet is a Write Last request packet or a Read Last response packet;

The eR_APSN variable is the response packet sequence number that the send queue expects to receive for a Read request;

The send bitmap header always points to the bitmap gap element that corresponds to the response packet sequence number the send queue expects to receive for a Read request.
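As a rough picture of the per-send-queue state listed in Step 2, the sketch below groups the variables and the Read operation information table into Python dataclasses. The field names follow the patent's variable names; the class names, the `post_request` helper, and the sample values are hypothetical and serve only to illustrate that every request packet consumes exactly one RPSN.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReadInfoEntry:
    ssn: int            # send sequence number of the Read message
    start_apsn: int     # APSN of the first response packet
    n_responses: int    # number of response packets expected
    virtual_addr: int   # virtual address associated with the Read


@dataclass
class SendQueueState:
    next_rpsn: int = 0
    msn_max: int = 0
    msn_min: int = 0
    ssn_first_read: int = 0
    read_table: list = field(default_factory=list)

    def post_request(self, read_entry: Optional[ReadInfoEntry] = None) -> int:
        """Assign the next RPSN; a Read also gets an information table entry."""
        rpsn = self.next_rpsn
        self.next_rpsn += 1        # always +1, whatever the Read's data length
        if read_entry is not None:
            self.read_table.append(read_entry)
        return rpsn


sq = SendQueueState()
first = sq.post_request()                                  # a Write packet
second = sq.post_request(ReadInfoEntry(ssn=1, start_apsn=1,
                                       n_responses=3, virtual_addr=0x1000))
third = sq.post_request()                                  # another Write packet
```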

Step 3. Perform different follow-up operations depending on the position of the network card in the network.

Each network card has multiple receivers and senders. Each sender sends multiple messages to a receiver, and each message consists of several request packets. After sending the request packets of a message, the sender waits for feedback from the receiver; once the response packet arrives, the sender, according to its content, either releases the storage of messages that were received in order or retransmits messages that were not received in order.

A single network card acts as the sender when it sends request packets and as the receiver when it sends response packets. Because its position in the network differs, the functions it must perform differ as well, so a different follow-up operation is chosen:

For a receiver network card, go to step 4;

For a sender network card, go to step 5.

Step 4. When a request packet arrives at the receiver's network card, the receiver takes the cumulative number of response packets executed as the response sequence number APSN of the response packet and performs out-of-order rearrangement.

Request packets and response packets do not correspond one-to-one in their sequence numbers. A Read operation contains one Read request packet, such as request packet 1 in Fig. 2, and a Read request packet occupies one request sequence number RPSN. A Read request packet may return multiple Response packets, such as the three Response packets 1, 2, and 3 in Fig. 2, and each Response packet occupies one response sequence number APSN. One request sequence number RPSN may therefore correspond to multiple response sequence numbers APSN.
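The two numbering spaces can be illustrated with a small Python sketch. It assumes that every Write packet consumes one APSN and that a Read consumes one APSN per MTU-sized response; the response count is rounded up here so that a partial final packet is counted, which is an assumption of this sketch. The function name and the 4096-byte MTU are illustrative only.

```python
def assign_sequence_numbers(requests, mtu=4096):
    """Map each request packet to its RPSN and the APSNs of its responses.

    requests: list of ("read", dma_length) or ("write", None) tuples,
              in the order the sender issues them.
    Returns a list of (rpsn, [apsn, ...]) pairs.
    """
    rpsn, apsn, numbering = 0, 0, []
    for kind, dma_length in requests:
        # A Read returns ceil(dma_length / mtu) responses; a Write gets one.
        n_responses = -(-dma_length // mtu) if kind == "read" else 1
        numbering.append((rpsn, list(range(apsn, apsn + n_responses))))
        rpsn += 1                  # one RPSN per request packet
        apsn += n_responses        # one APSN per response packet
    return numbering


numbering = assign_sequence_numbers(
    [("write", None), ("read", 12288), ("write", None)]
)
# The Read with RPSN 1 maps to three APSNs, so the following Write's
# response carries APSN 4 although its own RPSN is only 2.
```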

Referring to Fig. 3, this step is implemented as follows:

4.1) Determine whether the request sequence number RPSN of the request packet exceeds the range that bitmap entry Ki can record:

If so, discard the request packet directly; otherwise, go to 4.2);

4.2) Compare the request sequence number RPSN of the request packet with the expected receive sequence number eRPSN of the receiver's receive queue:

If RPSN<eRPSN, go to 4.3);

If RPSN>eRPSN, the request packet has arrived at the receive queue early; go to 4.4);

If RPSN=eRPSN, the request packet has arrived at the receive queue in order; go to 4.10);

4.3) The receive queue discards the request packet;

4.4) Compare the difference between RPSN and eRPSN with the length N of the bitmap gap array in bitmap entry Ki:

If RPSN-eRPSN>N, go to 4.3);

If RPSN-eRPSN≤N, go to 4.5);

4.5) Examine the OpCode field in the request packet header:

If the OpCode field is "Read Request" as defined in the Infiniband protocol, go to 4.6);

If the OpCode field is "Write Last" or "Write Only" as defined in the Infiniband protocol, go to 4.3);

If the OpCode field is "Write First" or "Write Middle" as defined in the Infiniband protocol, go to 4.7);

4.6) Cache the request packet at the network card and go to 4.8);

4.7) The receive queue executes the request content of the request packet and sets the corresponding bitmap gap element in the bitmap gap array to "01". The distance n between that bitmap gap element and the element the bitmap header points to equals the difference between RPSN and the receive queue's eRPSN. Then go to 4.8);

4.8) Examine the OpCode field in the request packet header again:

If the OpCode field is "Write First" or "Write Middle" as defined in the Infiniband protocol, go to 4.9);

If the OpCode field is "Write Only" or "Write Last" as defined in the Infiniband protocol, go to 4.10);

If the OpCode field is "Read Request" as defined in the Infiniband protocol, go to 4.11);

4.9) Increment the eRPSN and next_APSN variables of the receive queue by 1, point the bitmap header at the next bitmap gap element, and go to 4.12) to update the bitmap header;

4.10) Increment the eRPSN, next_APSN, and MSN variables of the receive queue by 1, point the bitmap header at the next bitmap gap element, and go to 4.12) to update the bitmap header;

4.11) Increment the eRPSN and MSN variables of the receive queue by 1, increment the APSN variable of the receive queue by L, point the bitmap header at the next bitmap gap element, and go to 4.12) to update the bitmap header, where L is the quotient of the DMA Length field in the request packet header and the network maximum transmission unit MTU;

4.12) Examine the value of the bitmap gap element that the bitmap header currently points to:

If it is "10", go to 4.13);

If it is "01", go to 4.14);

If it is "00", go to 4.15);

4.13) Increment the eRPSN, next_APSN, and MSN variables of the receive queue by 1, reset the current bitmap gap element, point the bitmap header at the next bitmap gap element, and return to 4.12);

4.14) Increment the eRPSN and next_APSN variables of the receive queue by 1, reset the current bitmap gap element, point the bitmap header at the next bitmap gap element, and return to 4.12);

4.15) Search the network card's cache for a specific Read request packet, namely the Read request packet whose request sequence number RPSN equals the receive queue's eRPSN:

If no such Read request packet is found, exit the bitmap header update sub-flow and wait for the next trigger;

If such a Read request packet is found, go to 4.16);

4.16) Increment the eRPSN and MSN variables of the receive queue by 1, increment the APSN variable of the receive queue by L, point the bitmap header at the next bitmap gap element, and return to 4.12), where L is the quotient of the DMA Length field in the request packet header and the network maximum transmission unit MTU.
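Two parts of the receiver flow above lend themselves to a compact Python sketch: the arrival decision of steps 4.1) to 4.5) and the header update loop of steps 4.12) to 4.16). The sketch is an interpretation, not the patented implementation. All names are hypothetical, the gap array is modelled as a dictionary keyed by RPSN, the "APSN variable" of steps 4.11) and 4.16) is taken to be next_APSN, and the counter updates of steps 4.8) to 4.11) are left out.

```python
from dataclasses import dataclass, field

EMPTY, ARRIVED, TAIL = "00", "01", "10"   # gap element values used in the text


@dataclass
class ReceiveQueue:
    n_slots: int = 8          # N, length of the gap array (illustrative value)
    erpsn: int = 0            # expected request sequence number
    next_apsn: int = 0
    msn: int = 0
    slots: dict = field(default_factory=dict)         # RPSN -> gap element value
    cached_reads: dict = field(default_factory=dict)  # RPSN -> L of a cached Read


def classify_request(q, rpsn, opcode):
    """Steps 4.1)-4.5): decide what happens to an arriving request packet."""
    if rpsn < q.erpsn:
        return "drop"                      # 4.3): stale or duplicate
    if rpsn == q.erpsn:
        return "in_order"                  # continues at 4.10) in the text
    if rpsn - q.erpsn > q.n_slots:
        return "drop"                      # 4.4): beyond what the bitmap records
    if opcode == "read_request":
        return "cache"                     # 4.6)
    if opcode in ("write_last", "write_only"):
        return "drop"                      # 4.5) -> 4.3)
    return "execute_and_mark"              # 4.7): Write First / Write Middle


def drain_head(q):
    """Steps 4.12)-4.16): move the header past elements already satisfied."""
    while True:
        state = q.slots.get(q.erpsn, EMPTY)
        if state in (ARRIVED, TAIL):       # 4.13) and 4.14)
            q.slots.pop(q.erpsn)           # reset the gap element
            q.next_apsn += 1
            if state == TAIL:
                q.msn += 1                 # a tail packet completes a message
            q.erpsn += 1
        elif q.erpsn in q.cached_reads:    # 4.15) -> 4.16): cached Read is due
            q.next_apsn += q.cached_reads.pop(q.erpsn)
            q.msn += 1
            q.erpsn += 1
        else:
            return                         # 4.15): nothing more to consume


rq = ReceiveQueue(n_slots=4, erpsn=5)
rq.slots = {5: ARRIVED, 6: TAIL}           # two Writes recorded earlier
rq.cached_reads = {7: 3}                   # a cached Read expecting 3 responses
drain_head(rq)
```

After the call the header has moved past all three recorded packets: eRPSN is 8, next_APSN has advanced by 1 + 1 + 3, and MSN by 2.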

Step 5. When a response packet arrives at the sender's network card, the sender takes the cumulative number of request packets sent as the request sequence number RPSN of the request packet and performs out-of-order rearrangement.

Referring to Fig. 4, the flow of this step is as follows:

5.1) Using the send queue identifier, the sender looks up the bitmap entry of the send queue in the send bitmap linked list.

5.2) Examine the OpCode field in the response packet header:

If it is "Acknowledge", the response packet is an ACK-type packet; go to 5.3);

If it is "Read Response", the response packet is a Response-type packet; go to 5.7);

5.3) Compare the MSN field in the response packet header with the send queue's MSN_max variable:

If MSN≤MSN_max, discard the response packet;

If MSN>MSN_max, go to 5.4);

5.4) The send queue updates MSN_max to the value of the MSN field in the response packet header and checks whether the Read operation information table contains a specific entry, namely an entry whose send sequence number element does not exceed the value of MSN_max:

If no such entry exists, go to 5.5);

If such an entry exists, go to 5.6);

5.5) The send queue sets MSN_min to the value of MSN_max and then releases the messages already sent by the sender according to the value of MSN_min, that is, completes work queue elements WQE according to the value of MSN_min;

5.6) Decrease the value of SSN_first_read by 1; the send queue sets MSN_min to the reduced SSN_first_read value and then releases the messages already sent by the sender according to the value of MSN_min, that is, completes work queue elements WQE according to the value of MSN_min;

5.7)将响应包首部的MSN字段的值与该发送队列MSN_max变量的值进行比较:5.7) Compare the value of the MSN field of the response packet header with the value of the send queue MSN_max variable:

若MSN>MSN_max,则发送队列将MSN_max变量的值更新为该响应包首部MSN字段的值,然后执行5.8);If MSN>MSN_max, the sending queue updates the value of the MSN_max variable to the value of the MSN field in the header of the response packet, and then executes 5.8);

若MSN≤MSN_max,则执行5.8);If MSN≤MSN_max, execute 5.8);

5.8)发送队列执行该响应包的操作内容,并在位图空隙数组中更新该响应包对应的位图空隙元素的值,再判断该响应包的请求序列号APSN与该发送队列位图条目中的eR_APSN变量的值是否相同:5.8) The sending queue executes the operation content of the response packet, updates the value of the bitmap slot element corresponding to the response packet in the bitmap slot array, and then determines the request sequence number APSN of the response packet and the bitmap entry of the sending queue. Is the value of the eR_APSN variable the same:

若APSN=eR_APSN,则说明该响应包按序到达发送端,执行5.9)对位图首部进行更新;If APSN=eR_APSN, it means that the response packet arrives at the sender in sequence, and execute 5.9) to update the bitmap header;

若APSN≠eR_APSN,则说明该响应包乱序到达发送端,返回5.6);If APSN≠eR_APSN, it means that the response packet arrives at the sender out of sequence, and returns 5.6);

5.9) Check the value of the bitmap gap element that the bitmap header currently points to:

If it is "00", exit the bitmap header update sub-process and wait for the next trigger condition;

If it is "01", point the bitmap header to the next bitmap gap element and return to 5.9);

If it is "10", execute 5.10);

5.10) Increase the value of the send queue's MSN_min variable by 1, and then check whether the Read operation information table still contains an unfinished Read operation entry:

If one exists, the send queue sets eR_APSN to the value of the start_APSN element of that entry, points the bitmap header to the next bitmap gap element, and returns to 5.9);

If none exists, the send queue sets MSN_min to the value of MSN_max, points the bitmap header to the next bitmap gap element, and returns to 5.9).
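The bitmap header update in steps 5.9) and 5.10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `SendQueue` class, the `read_starts` list (standing in for the start_APSN column of the Read operation information table), resetting a consumed gap element to "00", and removing the finished Read's entry before looking for the next one are all assumptions that the text does not spell out for the sender side.

```python
from dataclasses import dataclass, field

# 2-bit gap states as described in the text: empty, arrived, tail arrived
EMPTY, ARRIVED, TAIL = 0b00, 0b01, 0b10


@dataclass
class SendQueue:
    """Hypothetical container for the per-queue state named in the text."""
    msn_min: int
    msn_max: int
    er_apsn: int
    slots: list                     # bitmap gap array, one state per element
    head: int = 0                   # index the bitmap header points to
    read_starts: list = field(default_factory=list)  # start_APSN per unfinished Read


def update_bitmap_header(q: SendQueue) -> None:
    """Steps 5.9)-5.10): advance the header over gap elements that have arrived."""
    while True:
        i = q.head % len(q.slots)
        state = q.slots[i]
        if state == EMPTY:          # "00": exit and wait for the next trigger
            return
        q.slots[i] = EMPTY          # assumption: recycle the consumed element
        q.head += 1                 # point the header at the next gap element
        if state == ARRIVED:        # "01": go back to 5.9)
            continue
        # "10": the tail packet of a Read has arrived, step 5.10)
        q.msn_min += 1
        if q.read_starts:           # assumption: first entry is the Read just finished
            q.read_starts.pop(0)
        if q.read_starts:           # another unfinished Read exists
            q.er_apsn = q.read_starts[0]
        else:                       # no unfinished Read left
            q.msn_min = q.msn_max
```

Because every consumed element is reset to "00", the loop visits each element at most once per call and always terminates on a circular array.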

The above description is only one specific example of the invention and does not limit the invention in any way. Once those skilled in the art understand the content and principles of the invention, they may make various modifications and changes in form and detail without departing from its principles and structure; such modifications and changes based on the idea of the invention still fall within the scope of protection of its claims.

Claims (10)

1. A receiver out-of-order rearrangement method in a multi-path routing scenario, characterized by comprising:
(1) constructing a receive bitmap linked list in the network card of each network node, and allocating in it one bitmap entry for each active receive queue, to record the arrival status of that receive queue's request packets;
(2) constructing a send bitmap linked list in the network card of each network node, and allocating in it one bitmap entry for each active send queue, to record the arrival status of that send queue's response packets;
(3) according to the position of the network card in the network, performing different operations for each sender and each responder in the network card:
for a receiver, executing (4);
for a sender, executing (5);
(4) when a request packet arrives at the receiver's network card, the receiver uses the cumulative number of executed response packets as the response sequence number APSN of the response packet, compares the expected receive sequence number eRPSN of the receive queue with the request sequence number RPSN of the request packet, and caches, executes, or discards the request packet according to the comparison result;
(5) when a response packet arrives at the sender's network card, the sender uses the cumulative number of request packets sent as the request sequence number RPSN of the request packet and, according to the response packet type, the MSN message number of the response packet, and the bitmap entry corresponding to the send queue, performs one of two operations on the response packet: discarding it, or editing the work queue element WQE according to the MSN message number that the sender can complete.

2. The method according to claim 1, wherein, for the receive queue in (2):
each network card has multiple receivers, and each receiver maintains one receive queue to record information about the request packets it receives;
each receive queue maintains its own next_APSN and MSN variables:
the next_APSN variable is used to fill the sequence number field of the response packet;
the MSN variable is assigned by the receiver to the MSN field of the response packet header when the response packet is sent, so that the receiver's MSN information is conveyed to the sender in the response packet; the value of the MSN variable also serves as the MSN message number of the response packet.

3. The method according to claim 1, wherein each receiver bitmap entry in (2) consists of three parts, namely a receive queue identifier, a receive bitmap gap array, and a receive bitmap header, wherein:
the receive queue identifier marks the corresponding bitmap entry so that subsequent operations can index that entry in the receive bitmap linked list;
the receive bitmap gap array contains N bitmap gap elements, N≥1, and records the arrival status of the data packets arriving at the receiver; each bitmap gap element occupies 2 bits: the value "00" means the gap is empty and the corresponding packet has not arrived; the value "01" means the corresponding packet has arrived; the value "10" means a tail packet has arrived; all bitmap gap elements are initialized to "00"; a tail packet is a Write Last request packet or a Read Last response packet;
the receive bitmap header always points to the bitmap gap element corresponding to the expected receive sequence number eRPSN of the receive queue.

4. The method according to claim 1, wherein each send queue in (3) maintains its own next_RPSN, MSN_max, MSN_min, and SSN_first_read variables and a Read operation information table:
the next_RPSN variable is used to fill the sequence number field of the request packet header; regardless of the requested data length of the current Read request packet, the RPSN of any request packet that follows it is incremented by only 1;
the MSN_max variable is the message sequence number cumulatively acknowledged by the receiver, that is, the sequence number already released by the receiver, wherein the MSN of an ACK-type response packet never exceeds the value of MSN_max;
the MSN_min variable is the largest message sequence number that the sender can currently complete; the MSN of a Response-type response packet is never smaller than the value of MSN_min;
the SSN_first_read variable is the message sequence number of the first Read message among the messages containing Read request packets that the sender has sent but whose storage has not yet been released;
the Read operation information table allocates one entry for the request packet of each Read operation; each entry contains four elements: the send sequence number, the starting response sequence number start_APSN of the response packets, the number of response packets, and the virtual address.

5. The method according to claim 1, wherein each sender bitmap entry in (3) consists of four parts, namely a send queue identifier, a send bitmap gap array, an eR_APSN variable, and a send bitmap header:
the send queue identifier marks the corresponding bitmap entry so that subsequent operations can index that entry in the send bitmap linked list;
the send bitmap gap array contains N bitmap gap elements, N≥1, and records the arrival status of the data packets arriving at the sender; each bitmap gap element occupies 2 bits: the value "00" means the gap is empty and the corresponding packet has not arrived; the value "01" means the corresponding packet has arrived; the value "10" means a tail packet has arrived; all bitmap gap elements are initialized to "00"; a tail packet is a Write Last request packet or a Read Last response packet;
the eR_APSN variable is the sequence number of the Read response packet that the send queue expects to receive;
the send bitmap header always points to the bitmap gap element corresponding to the sequence number of the Read response packet that the send queue expects to receive.

6. The method according to claim 1, wherein the request packets in (4) include Read request packets and Write request packets, wherein:
Read request packets are further divided, according to the value of the OpCode field in the header, into Read First, Read Middle, Read Only, and Read Last request packets;
Write request packets are further divided, according to the value of the OpCode field in the header, into Write First, Write Middle, Write Only, and Write Last request packets.

7. The method according to claim 1, wherein in (4) different operations are performed on the request packet according to the result of comparing the expected receive sequence number eRPSN of the receiver's receive queue with the request sequence number RPSN of the request packet, implemented as follows:
(4a) determine whether the request sequence number RPSN of the request packet exceeds the range that bitmap entry K_i can record:
if so, discard the request packet directly; otherwise, execute (4b);
(4b) compare the request sequence number RPSN of the request packet with the expected receive sequence number eRPSN of the receive queue:
if RPSN < eRPSN, execute (4c);
if RPSN > eRPSN, the request packet has arrived at the receive queue early; execute (4d);
if RPSN = eRPSN, the request packet has arrived at the receive queue in order; execute (4j);
(4c) the receive queue discards the request packet;
(4d) compare the difference between RPSN and eRPSN with the length N of the bitmap gap array in bitmap entry K_i:
if RPSN - eRPSN > N, execute (4c);
if RPSN - eRPSN ≤ N, execute (4e);
(4e) check the value of the OpCode field in the request packet header:
if it is "ReadRequest", execute (4f);
if it is "Write Last" or "Write Only", execute (4c);
if it is "Write First" or "Write Middle", execute (4g);
(4f) cache the request packet at the network card and execute (4h);
(4g) the receive queue executes the request carried by the packet and sets the corresponding bitmap gap element in the bitmap gap array to "01"; the distance n between that element and the element the bitmap header points to is the difference between RPSN and the eRPSN of the receive queue; then execute (4h);
(4h) check the value of the OpCode field in the request packet header again:
if it is "Write First" or "Write Middle", execute (4l);
if it is "Write Only" or "Write Last", execute (4j);
if it is "ReadRequest", execute (4k);
(4i) increment the eRPSN and next_APSN variables of the receive queue by 1, point the bitmap header to the next bitmap gap element, and execute (4l);
(4j) increment the eRPSN, next_APSN, and MSN variables of the receive queue by 1, point the bitmap header to the next bitmap gap element, and execute (4l);
(4k) increment the eRPSN and MSN variables of the receive queue by 1, increment the APSN variable of the receive queue by L, point the bitmap header to the next bitmap gap element, and execute (4l), where L is the quotient of the DMA Length field value in the request packet header and the network maximum transmission unit (MTU) value;
(4l) update the bitmap header.

8. The method according to claim 7, wherein the bitmap header update in (4l) is implemented as follows:
(4l1) check the value of the bitmap gap element that the bitmap header currently points to:
if it is "10", execute (4l2);
if it is "01", execute (4l3);
if it is "00", execute (4l4);
(4l2) increment the eRPSN, next_APSN, and MSN variables of the receive queue by 1, reset the value of the current bitmap gap element, point the bitmap header to the next bitmap gap element, and return to (4l1);
(4l3) increment the eRPSN and next_APSN variables of the receive queue by 1, reset the value of the current bitmap gap element, point the bitmap header to the next bitmap gap element, and return to (4l1);
(4l4) search the network card cache for a specific Read request packet, namely the Read request packet whose response sequence number RPSN equals the eRPSN of the receive queue:
if no such Read request packet is found, exit the bitmap header update process and wait for the next trigger condition;
if such a Read request packet is found, execute (4l5);
(4l5) increment the eRPSN and MSN variables of the receive queue by 1, increment the APSN variable of the receive queue by L, point the bitmap header to the next bitmap gap element, and return to (4l1), where L is the quotient of the DMA Length field value in the request packet header and the network maximum transmission unit (MTU) value.

9. The method according to claim 1, wherein in (5) the choice between the two operations on the response packet, made according to the response packet type, the message number MSN of the response packet, and the bitmap entry corresponding to the send queue, is implemented as follows:
(5a) after the response packet arrives at the sender's network card, the sender uses the send queue identifier to index the bitmap entry of that send queue in the send bitmap linked list;
(5b) check the value of the OpCode field in the response packet header:
if it is "Acknowledge", the response packet is an ACK-type packet; execute (5c);
if it is "ReadResponse", the response packet is a Response-type packet; execute (5g);
(5c) compare the value of the MSN field in the response packet header with the value of the send queue's MSN_max variable:
if MSN ≤ MSN_max, discard the response packet;
if MSN > MSN_max, execute (5d);
(5d) the send queue updates MSN_max to the value of the MSN field in the response packet header and checks whether the Read operation information table contains a specific entry, namely an entry whose send sequence number element does not exceed the value of MSN_max:
if none exists, execute (5e);
if one exists, execute (5f);
(5e) the send queue sets MSN_min to the value of MSN_max and then releases the messages already sent by the sender according to the value of MSN_min, that is, it completes the work queue elements (WQEs) according to the value of MSN_min;
(5f) the value of SSN_first_read is decreased by 1, the send queue sets MSN_min to the decreased SSN_first_read value, and then releases the messages already sent by the sender according to the value of MSN_min, that is, it completes the work queue elements (WQEs) according to the value of MSN_min;
(5g) compare the value of the MSN field in the response packet header with the value of the send queue's MSN_max variable:
if MSN > MSN_max, the send queue updates MSN_max to the value of the MSN field in the response packet header and then executes (5h);
if MSN ≤ MSN_max, execute (5h);
(5h) the send queue executes the operation carried by the response packet, updates the value of the bitmap gap element corresponding to the response packet in the bitmap gap array, and then checks whether the request sequence number APSN of the response packet equals the value of the eR_APSN variable in the send queue's bitmap entry:
if APSN = eR_APSN, the response packet has arrived at the sender in order; execute (5i);
if APSN ≠ eR_APSN, the response packet has arrived at the sender out of order; return to (5f);
(5i) update the bitmap header.

10. The method according to claim 9, wherein the bitmap header update in (5i) is implemented as follows:
(5i1) check the value of the bitmap gap element that the bitmap header currently points to:
if it is "00", the current gap status is "empty"; exit the bitmap header update process and wait for the next trigger condition;
if it is "01", the current gap status is "arrived"; point the bitmap header to the next bitmap gap element and return to (5i1);
if it is "10", the current gap status is "tail arrived"; execute (5i2);
(5i2) increase the value of the send queue's MSN_min variable by 1, and then check whether the Read operation information table still contains an unfinished Read operation entry:
if one exists, the send queue sets eR_APSN to the value of the start_APSN element of that entry, points the bitmap header to the next bitmap gap element, and returns to (5i1);
if none exists, the send queue sets MSN_min to the value of MSN_max, points the bitmap header to the next bitmap gap element, and returns to (5i1).
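As an illustration of the receiver-side decision in claim 7, steps (4a) through (4g), the following Python sketch maps a request packet to one of four outcomes. It is a simplified reading under stated assumptions rather than the claimed implementation: the function name and the returned labels are hypothetical, sequence-number wraparound is ignored, and the range check in (4a) is folded into the comparisons of (4b) and (4d).

```python
def classify_request(rpsn: int, erpsn: int, n_slots: int, opcode: str) -> str:
    """Decide what the receive queue does with a request packet.

    rpsn    -- request sequence number RPSN carried by the packet
    erpsn   -- expected receive sequence number eRPSN of the receive queue
    n_slots -- length N of the bitmap gap array
    opcode  -- OpCode value from the packet header
    """
    if rpsn < erpsn:                       # (4b) -> (4c): stale packet
        return "discard"
    if rpsn == erpsn:                      # (4b) -> (4j): arrived in order
        return "in_order"
    if rpsn - erpsn > n_slots:             # (4d) -> (4c): beyond what the bitmap records
        return "discard"
    # The packet arrived early but fits in the bitmap: decide by OpCode, step (4e).
    if opcode == "ReadRequest":            # (4f): hold it in the NIC cache
        return "cache"
    if opcode in ("Write Last", "Write Only"):   # (4c)
        return "discard"
    if opcode in ("Write First", "Write Middle"):  # (4g): execute and mark "01"
        return "execute_and_mark"
    raise ValueError(f"unexpected OpCode: {opcode!r}")
```

For example, with eRPSN = 7 and N = 8, an early `Write Middle` with RPSN = 9 is executed and its gap element marked, while an early `Write Only` with the same RPSN is dropped and must be retransmitted.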
CN202010629313.9A | 2020-07-03 | 2020-07-03 | Receiver out-of-order rearrangement method in multi-path routing scenario | Active | CN111711566B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010629313.9A, CN111711566B (en) | 2020-07-03 | 2020-07-03 | Receiver out-of-order rearrangement method in multi-path routing scenario

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010629313.9A, CN111711566B (en) | 2020-07-03 | 2020-07-03 | Receiver out-of-order rearrangement method in multi-path routing scenario

Publications (2)

Publication Number | Publication Date
CN111711566A (en) | 2020-09-25
CN111711566B (en) | 2021-07-27

Family

ID=72546399

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202010629313.9A (Active), CN111711566B (en) | Receiver out-of-order rearrangement method in multi-path routing scenario | 2020-07-03 | 2020-07-03

Country Status (1)

Country | Link
CN (1) | CN111711566B (en)


Also Published As

Publication number | Publication date
CN111711566B (en) | 2021-07-27


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
