CN115134307B - Load balancing method based on packet loss rate coding in cloud computing - Google Patents

Load balancing method based on packet loss rate coding in cloud computing

Info

Publication number
CN115134307B
Authority
CN
China
Prior art keywords
packets
packet loss
loss rate
packet
coded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210740895.7A
Other languages
Chinese (zh)
Other versions
CN115134307A (en)
Inventor
王静
刘颖
胡晋彬
王进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202210740895.7A
Publication of CN115134307A
Application granted
Publication of CN115134307B
Legal status: Active (current)
Anticipated expiration


Abstract

The invention discloses a load balancing method based on packet loss rate coding in cloud computing, and relates to the technical field of data processing. The sending end detects the packet loss rate and applies forward error correction coding to the source packets within the congestion window to generate coded packets. The switch randomly scatters the coded packets across all parallel paths. The receiving end decodes the source packets once it has received a sufficient number of coded packets. By detecting the packet loss probability, performing forward error correction coding on the source packets, and randomly scattering the coded packets over all paths, the method balances the load while effectively reducing the probability of TCP retransmission timeouts caused by packet loss under a random packet scattering mechanism.

Description

Translated from Chinese
Load balancing method based on packet loss rate coding in cloud computing

Technical Field

The present invention relates to the field of data processing technology, and in particular to a load balancing method based on packet loss rate coding in cloud computing.

Background Art

Modern cloud computing applications such as data mining, big data analytics, web search, and deep learning have stringent requirements for low latency and high throughput. To meet these requirements, edge cloud computing extends cloud capabilities to the network edge by placing storage and computing resources in distributed edge data centers that provide services closer to users; these edge data centers are connected to large data centers through high-speed networks and to clients through various lower-speed networks. Edge data center networks typically deploy large numbers of equal-cost parallel paths between senders and receivers to provide high bisection bandwidth and support efficient transmission. Network topologies such as fat-tree and leaf-spine are widely deployed in data centers.

In recent years, a packet-based load balancing scheme called the random packet scattering mechanism (RPS) has been proposed for data centers. RPS randomly scatters packets over all parallel paths between the sender and receiver to make full use of bandwidth resources. In this way, packets are transmitted over multiple paths simultaneously and flows can complete faster. However, under many-to-one burst traffic, RPS may aggravate packet loss and even retransmission timeouts, resulting in poor performance. Burst traffic is very common in edge data centers. For example, in distributed deep learning training, a large number of worker nodes simultaneously transmit training gradients to an aggregation server to update the global model parameters at the end of each iteration. Under such sudden congestion, packet loss is very likely to occur. In the worst case, the loss of an entire TCP congestion window of packets, or the loss of one of the last three tail packets of a flow, triggers a TCP retransmission timeout (RTO). The default RTO is 200 milliseconds, while data center round-trip times (RTT) are only tens or hundreds of microseconds, which means that no new packets can be sent for a comparatively long time until the lost packets are successfully retransmitted. In other words, timeouts cause throughput to drop dramatically. How to reduce packet loss, retransmissions, and timeouts, thereby shortening flow completion times and improving application performance, is therefore an urgent problem.

Summary of the Invention

To reduce the probability of TCP retransmission timeouts caused by packet loss under the random packet scattering mechanism in burst traffic scenarios in data center networks, the present invention provides a load balancing method based on packet loss rate coding in cloud computing.

To solve the above technical problem, the present invention adopts the following technical scheme: a load balancing method based on packet loss rate coding in cloud computing, comprising:

1. The sending end operates as follows:

Step S11: initialize the base round-trip time RTT, the packet loss rate update period threshold T_th, the packet loss rate P, and the start time t of the packet loss rate update period; let r denote the number of coded redundant packets and w the number of source packets within the TCP congestion window CWND.

Step S12: determine whether the difference between the current time and the start time t of the packet loss rate update period is greater than or equal to the update period threshold T_th. If so, update the packet loss rate P to the ratio of the number of unacknowledged packets to the total number of packets sent within the update period T_th, and set the start time t of the update period to the current time; otherwise, go to step S13.

Step S13: compute the number of coded redundant packets r from the packet loss rate P, and apply random linear combination coding to the w source packets within the TCP congestion window CWND to generate w+r coded packets.

Step S14: send the coded packets obtained in step S13 to the switch, and check whether all packets have been sent. If so, stop; otherwise, go to step S12.
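The sender-side flow of steps S11 to S14 can be illustrated with a short sketch. The following is a minimal Python sketch, not the patent's implementation: the class name, the packet representation, and the `send_to_switch`/`encode_window` helpers are hypothetical, and the ACK bookkeeping is simplified. It only shows how the loss-rate update period and the redundancy count r interact.

```python
import math
import time

RTT = 50e-6          # base round-trip time, 50 microseconds (step S11)
T_TH = 2 * RTT       # loss-rate update period threshold T_th = 2*RTT (step S11)

class CrpsSender:
    """Sketch of the CRPS sender (steps S11-S14); helper functions are hypothetical."""

    def __init__(self):
        self.P = 0.0       # measured packet loss rate, initialised to 0
        self.t = 0.0       # start time of the current update period
        self.sent = 0      # packets sent in the current update period
        self.unacked = 0   # packets not yet acknowledged in the period

    def on_ack(self):
        # Simplified: one ACK marks one previously sent packet as acknowledged.
        if self.unacked > 0:
            self.unacked -= 1

    def update_loss_rate(self, now):
        # Step S12: once the update period T_th has elapsed, re-estimate P as the
        # ratio of unacknowledged packets to packets sent within the period.
        if now - self.t >= T_TH:
            if self.sent > 0:
                self.P = self.unacked / self.sent
            self.t = now
            self.sent = 0
            self.unacked = 0

    def redundancy(self, w):
        # Step S13 / equation (1): smallest integer r with (1 - P)(w + r) >= w.
        p = min(self.P, 0.99)          # clamp to avoid division by zero if P == 1
        return math.ceil(w / (1.0 - p) - w)

    def send_window(self, source_packets, encode_window, send_to_switch):
        # Steps S13-S14: encode the w source packets of the current CWND into
        # w + r coded packets and hand each of them to the switch.
        self.update_loss_rate(time.time())
        w = len(source_packets)
        r = self.redundancy(w)
        for coded in encode_window(source_packets, r):   # yields w + r packets
            send_to_switch(coded)
            self.sent += 1
            self.unacked += 1
```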

2. The switch operates as follows:

Step S21: receive coded packets.

Step S22: randomly select an output port and forward the coded packet on it; return to step S21.

3. The receiving end operates as follows:

Step S31: receive coded packets.

Step S32: determine whether the number of received coded packets equals the number of source packets in the TCP congestion window CWND. If so, decode the source packets, deliver them to the upper TCP layer and application layer, and go to step S33; otherwise, go to step S31.

Step S33: send an acknowledgment packet ACK to the sending end, and return to step S31.
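A correspondingly simplified receiver sketch for steps S31 to S33 is shown below. It is a sketch only; the `decode_window`, `send_ack`, and `deliver_to_tcp` helpers are hypothetical placeholders for the decoding formula and the delivery/ACK actions described above.

```python
class CrpsReceiver:
    """Sketch of the CRPS receiver (steps S31-S33); helper functions are hypothetical."""

    def __init__(self, w, decode_window, send_ack, deliver_to_tcp):
        self.w = w                          # number of source packets in the CWND
        self.buffer = []                    # coded packets received so far
        self.decode_window = decode_window
        self.send_ack = send_ack
        self.deliver_to_tcp = deliver_to_tcp

    def on_coded_packet(self, coded_packet):
        # Steps S31/S32: collect coded packets until w of them have arrived.
        self.buffer.append(coded_packet)
        if len(self.buffer) < self.w:
            return
        # Step S32: any w coded packets suffice to recover the w source packets.
        source_packets = self.decode_window(self.buffer[:self.w])
        self.deliver_to_tcp(source_packets)
        # Step S33: acknowledge and start collecting the next window.
        self.send_ack()
        self.buffer = []
```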

Further, in step S11, at initialization the base round-trip time RTT is set to 50 μs, the packet loss rate update period threshold T_th is set to 2 RTT, and the packet loss rate P and the start time t of the packet loss rate update period are set to 0.

Further, in step S13, the number of coded redundant packets r is computed according to:

r ≥ w/(1−P) − w    (1)

To reduce the number of redundant packets transmitted, r is taken as the smallest integer that is greater than or equal to w/(1−P) − w, i.e. r = ⌈w/(1−P) − w⌉.
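As an illustrative worked example (the specific numbers are not taken from the patent): if the congestion window holds w = 10 source packets and the measured loss rate is P = 0.1, equation (1) gives w/(1−P) − w = 10/0.9 − 10 ≈ 1.11, so r = ⌈1.11⌉ = 2 redundant packets and w + r = 12 coded packets are generated for that window.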

Further, in step S13, the w source packets within the TCP congestion window CWND are encoded into w+r coded packets using the random linear combination coding method, with the coding formula:

a = mH    (2)

where a denotes the coded packets, m the source packets, and H the generator matrix. In matrix form:

[a_1 a_2 ... a_{w+r}] = [m_1 m_2 ... m_w] H

where [a_1 a_2 ... a_{w+r}] is the row vector of the w+r coded packets obtained by encoding, [m_1 m_2 ... m_w] is the row vector of the w source packets, H = (h_{i,j})_{w×(w+r)} is the w×(w+r) generator matrix, and h_{i,j} is the element in row i, column j of the generator matrix, with i = 1, 2, ..., w and j = 1, 2, ..., w+r. The generator matrix H = (h_{i,j})_{w×(w+r)} is defined as follows: for j ≤ w, h_{i,j} = 1 if i ≤ j and h_{i,j} = 0 if i > j; for j > w, h_{i,j} = i − 1 + j − w. In other words, the first w columns form an upper-triangular block of ones and, for k = 1, 2, ..., r, column w+k has entry i − 1 + k in row i:

        | 1 1 ... 1 | 1    2    ...  r       |
    H = | 0 1 ... 1 | 2    3    ...  r+1     |
        | . . ... . | .    .    ...  .       |
        | 0 0 ... 1 | w    w+1  ...  w+r−1   |

Preferably, in step S32, after the receiving end has received any w coded packets, it decodes the w source packets using the following decoding formula:

[m_1 m_2 ... m_w] = [a_1 a_2 ... a_w] (H′)^(−1)

where [a_1 a_2 ... a_w] is the row vector formed by any w coded packets received, and H′ is the w×w matrix formed by the w columns of the sender's generator matrix H corresponding to those packets; h_{i,j} is the i-th element of column j of H, with i ≤ w and j = 1, 2, ..., w+r; for j ≤ w, h_{i,j} = 1 if i ≤ j and h_{i,j} = 0 if i > j; otherwise, for j > w, h_{i,j} = i − 1 + j − w.

The load balancing method based on packet loss rate coding in cloud computing according to the present invention detects the packet loss probability, applies forward error correction coding to the source packets, and randomly scatters the coded packets over all paths, thereby effectively reducing the probability of TCP retransmission timeouts caused by packet loss under the random packet scattering mechanism in burst traffic scenarios in data center networks. Specifically, the sending end encodes the source packets using forward error correction (FEC) and dynamically adjusts the redundancy of the coded packets according to the packet loss rate it measures. To adapt to different degrees of congestion, CRPS increases the number of redundant coded packets under high loss rates to reduce the probability of retransmission timeouts, and otherwise decreases it to reduce redundant traffic. The switch then randomly scatters the coded packets over all available parallel paths to improve link utilization. As a result, even if some coded packets are delayed or lost, the source packets can be successfully decoded as long as a sufficient number of coded packets arrive at the receiving end over any of the parallel paths, avoiding packet retransmissions and timeouts and thus achieving the goal of reducing the probability of retransmission timeouts caused by packet loss under the random packet scattering mechanism.

Brief Description of the Drawings

Figure 1 is a flow chart of the load balancing method based on packet loss rate coding in cloud computing according to the present invention;

Figure 2 is a topology diagram of the test scenario on the NS-2 network simulation platform in an embodiment of the present invention;

Figure 3 shows the performance test results under web search and data mining workloads in an embodiment of the present invention;

Figure 4 shows the probability of Incast occurring under RPS and CRPS for different numbers of concurrent flows and different buffer sizes in an embodiment of the present invention.

Detailed Description of Embodiments

To facilitate understanding by those skilled in the art, the present invention is further described below with reference to the embodiments and drawings; the content of the embodiments does not limit the present invention.

Before describing the present invention in detail, its design rationale is explained here. To reduce the probability of retransmission timeouts caused by packet loss in burst traffic scenarios, the present invention designs a load balancing method based on packet loss rate coding in cloud computing through a coding-based random packet scattering mechanism, called CRPS, which relies mainly on coding to effectively reduce tail latency. Specifically, the sending end encodes the source packets using forward error correction (FEC) and dynamically adjusts the redundancy of the coded packets according to the packet loss rate it measures. To adapt to different degrees of congestion, CRPS increases the number of redundant coded packets under high loss rates to reduce the probability of retransmission timeouts, and otherwise decreases it to reduce redundant traffic. The switch then randomly scatters the coded packets over all available parallel paths to improve link utilization. As a result, even if some coded packets are delayed or lost, the source packets can be successfully decoded as long as a sufficient number of coded packets arrive at the receiving end over any of the parallel paths, avoiding packet retransmissions and timeouts. Although redundant coded packets consume link bandwidth, the benefit of reducing the timeout probability far outweighs the redundant traffic overhead introduced by coding. CRPS is implemented between the transport layer and the network layer of the end hosts (both sender and receiver), without modifying the Transmission Control Protocol (TCP) or Internet Protocol (IP) stack. The invention is described in detail below.

Referring to Figure 1, a load balancing method based on packet loss rate coding in cloud computing mainly comprises the processing at three components: the sending end, the switch, and the receiving end. The process is as follows.

First, the type of the current operating entity is determined.

If it is the sending end, it operates as follows:

Step S11: initialize the base round-trip time RTT to 50 μs, the packet loss rate update period threshold T_th to 2 RTT, and the packet loss rate P and the start time t of the packet loss rate update period to 0; let r denote the number of coded redundant packets and w the number of source packets within the TCP congestion window CWND.

Step S12: determine whether the difference between the current time and the start time t of the packet loss rate update period is greater than or equal to the update period threshold T_th. If so, update the packet loss rate P to the ratio of the number of unacknowledged packets to the total number of packets sent within the update period T_th, and set the start time t of the update period to the current time; otherwise, go to step S13.

Step S13: compute the number of coded redundant packets r from the packet loss rate P, and apply random linear combination coding to the w source packets within the TCP congestion window CWND to generate w+r coded packets. To ensure that the receiving end receives enough coded packets for successful decoding, the following must hold:

(1−P)(w+r) ≥ w;

that is: r ≥ w/(1−P) − w    (1)

To reduce the number of redundant packets transmitted, r is taken as the smallest integer that is greater than or equal to w/(1−P) − w, i.e. r = ⌈w/(1−P) − w⌉.

When the w source packets within the TCP congestion window CWND are encoded into w+r coded packets using the random linear combination coding method, the coding formula is:

a = mH    (2)

where a denotes the coded packets, m the source packets, and H the generator matrix. In matrix form:

[a_1 a_2 ... a_{w+r}] = [m_1 m_2 ... m_w] H

where [a_1 a_2 ... a_{w+r}] is the row vector of the w+r coded packets obtained by encoding, [m_1 m_2 ... m_w] is the row vector of the w source packets, H = (h_{i,j})_{w×(w+r)} is the w×(w+r) generator matrix, and h_{i,j} is the element in row i, column j of the generator matrix, with i = 1, 2, ..., w and j = 1, 2, ..., w+r. The generator matrix H = (h_{i,j})_{w×(w+r)} is defined as follows: for j ≤ w, h_{i,j} = 1 if i ≤ j and h_{i,j} = 0 if i > j; for j > w, h_{i,j} = i − 1 + j − w. In other words, the first w columns form an upper-triangular block of ones and, for k = 1, 2, ..., r, column w+k has entry i − 1 + k in row i:

        | 1 1 ... 1 | 1    2    ...  r       |
    H = | 0 1 ... 1 | 2    3    ...  r+1     |
        | . . ... . | .    .    ...  .       |
        | 0 0 ... 1 | w    w+1  ...  w+r−1   |
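The generator matrix and the encoding of equation (2) can be sketched numerically. The snippet below is a sketch rather than the patent's implementation: it builds H from the element-wise definition above and encodes w source packets treated as vectors of numbers; a real implementation would operate on packet payloads, typically over a finite field, which the specification does not spell out and is therefore an assumption here.

```python
import numpy as np

def generator_matrix(w, r):
    """Build the w x (w+r) generator matrix H of equation (2).
    Using the 1-based indices of the text: h[i][j] = 1 if i <= j and j <= w,
    h[i][j] = 0 if i > j and j <= w, and h[i][j] = i - 1 + j - w for j > w."""
    H = np.zeros((w, w + r))
    for i in range(1, w + 1):
        for j in range(1, w + r + 1):
            if j <= w:
                H[i - 1, j - 1] = 1.0 if i <= j else 0.0
            else:
                H[i - 1, j - 1] = i - 1 + j - w
    return H

def encode(source, r):
    """Encode w source packets (rows of a numeric array) into w+r coded
    packets via a = m * H, i.e. coded packet k is sum_i h[i][k] * m_i."""
    m = np.asarray(source, dtype=float)   # shape (w, payload_len)
    H = generator_matrix(m.shape[0], r)
    return H.T @ m                        # shape (w+r, payload_len)
```

For example, `generator_matrix(3, 2)` returns the 3×5 matrix [[1, 1, 1, 1, 2], [0, 1, 1, 2, 3], [0, 0, 1, 3, 4]], matching the block structure shown above.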

It is worth noting that the present invention dynamically adjusts the number of coded redundant packets r according to the packet loss rate P measured at the sending end. To adapt to different degrees of congestion, CRPS increases r under high loss rates to reduce the probability of retransmission timeouts, and otherwise decreases r to reduce redundant traffic.

Step S14: send the coded packets obtained in step S13 to the switch, and check whether all packets have been sent. If so, stop; otherwise, go to step S12.

If it is the switch, after receiving a coded packet it randomly selects an output port on which to forward the packet, and then continues to receive coded packets. The switch randomly scatters coded packets over all available parallel paths to improve link utilization. Therefore, even if some coded packets are delayed or lost, the source packets can be successfully decoded as long as a sufficient number of coded packets arrive at the receiving end over any of the parallel paths, avoiding packet retransmissions and timeouts.

If it is the receiving end, after receiving coded packets it checks whether the number of received coded packets equals the number w of source packets in the TCP congestion window CWND. If not, it continues to receive coded packets; if so, it decodes the source packets, delivers them to the upper TCP layer and application layer, sends an acknowledgment packet ACK to the sending end, and then continues to receive coded packets. Here, after receiving any w coded packets, the receiving end decodes the w source packets using the following decoding formula:

[m_1 m_2 ... m_w] = [a_1 a_2 ... a_w] (H′)^(−1)

where [a_1 a_2 ... a_w] is the row vector formed by any w coded packets received, and H′ is the w×w matrix formed by the w columns of the sender's generator matrix H corresponding to those packets; h_{i,j} is the i-th element of column j of H, with i ≤ w and j = 1, 2, ..., w+r; for j ≤ w, h_{i,j} = 1 if i ≤ j and h_{i,j} = 0 if i > j; otherwise, for j > w, h_{i,j} = i − 1 + j − w.
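Decoding can be sketched in the same way: the receiver keeps track of which coded packets it received, takes the corresponding columns of H, and inverts that square submatrix. This sketch reuses the hypothetical `generator_matrix()` from the encoding sketch above, works on numeric payloads rather than finite-field symbols, and assumes that the w×w submatrix formed by the received columns is invertible.

```python
import numpy as np
# generator_matrix() is the helper defined in the encoding sketch above.

def decode(coded_received, received_indices, w, r):
    """Recover the w source packets from any w received coded packets.

    coded_received   : array of shape (w, payload_len), the received coded packets
    received_indices : 0-based column indices in H of those packets
    Since a' = m * H' (with H' the w x w submatrix of H formed by the received
    columns), decoding computes m = a' * inv(H'), cf. the decoding formula above.
    """
    H = generator_matrix(w, r)
    H_sub = H[:, received_indices]                # w x w submatrix H'
    a = np.asarray(coded_received, dtype=float)   # shape (w, payload_len)
    # Encoding produced a = H_sub.T @ m, so solve H_sub.T @ m = a for m.
    return np.linalg.solve(H_sub.T, a)
```

For instance, with w = 3 and r = 2, if coded packets 0, 1, and 3 (0-based) arrive, the submatrix [[1, 1, 1], [0, 1, 2], [0, 0, 3]] is upper triangular and invertible, so the three source packets are recovered exactly.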

Through the coding-based random packet scattering mechanism described above, the present invention can effectively reduce the probability of retransmission timeouts caused by packet loss in burst traffic scenarios. Specifically, when the load balancing method of the present invention is adopted in a data center network environment, the probability of Incast occurring is P′_Incast, which is calculated as follows.

1) First, compute the packet loss probability P_l′, which depends on the following quantities:

where C is the link bandwidth, RTT_min is the minimum round-trip time, MSS is the size of a packet within the TCP congestion window CWND, B is the buffer size, and n is the number of concurrent flows.

2) Next, compute the probability P′_tto that a flow times out because its last three tail packets are lost, which depends on the following quantities:

where R is the number of transmission rounds of a flow's packets, W_R is the TCP congestion window CWND size in round R, r_R is the number of coded redundant packets corresponding to the packets in window W_R, W_i is the CWND size in round i, and r_i is the number of coded redundant packets corresponding to the packets in window W_i.

3) Then compute the probability P′_wto that a flow times out because an entire TCP congestion window CWND of packets is lost during transmission.

4) Next, compute the timeout probability P′_to of a flow:

P′_to = P′_tto + P′_wto    (6)

5) Finally, compute the probability P′_Incast that Incast occurs:

P′_Incast = 1 − (1 − P′_to)^n    (7).
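Equations (6) and (7) combine the two timeout causes into a per-flow timeout probability and then into the probability that at least one of the n flows times out (Incast). The sketch below implements only these last two steps; the inputs P′_tto and P′_wto would come from the preceding steps, whose formulas are not reproduced here, so the example values in the comment are purely illustrative assumptions.

```python
def incast_probability(p_tto, p_wto, n):
    """Equations (6) and (7): per-flow timeout probability and Incast probability.

    p_tto : probability a flow times out because its last three tail packets are lost
    p_wto : probability a flow times out because a whole congestion window is lost
    n     : number of concurrent flows
    """
    p_to = p_tto + p_wto                  # equation (6)
    return 1.0 - (1.0 - p_to) ** n        # equation (7)

# Illustrative values only (not taken from the patent):
# incast_probability(0.001, 0.0005, 40) ~= 0.058
```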

To verify the effectiveness of the present invention, its performance is tested on the NS-2 network simulation platform.

Figure 2 shows the topology of the test scenario on the NS-2 network simulation platform. The experiment uses a leaf-spine network topology; the network contains 8 leaf switches, 8 spine switches, and 40 end hosts (each end host acts as both a sender and a receiver), and each leaf switch communicates with the 40 end hosts. All links have a bandwidth of 40 Gbps, the propagation delay of each link is 5 microseconds, and the switch buffer holds 256 packets.

TCP is used as the default transport protocol in the experiments. Two typical data center workloads are generated: web search and data mining. In the web search workload, about 62% of flows are smaller than 100 KB and the average flow size is 1.6 MB. In the data mining workload, about 83% of flows are smaller than 100 KB and contribute less than 5% of the traffic. Under both realistic workloads, all flows are generated between randomly selected server pairs, and flow arrival times follow a Poisson distribution. During the experiments, the average network load is increased from 0.3 to 0.7.

Figure 3 shows the performance test results under the web search and data mining workloads: Figure 3(a) shows the average flow completion time for web search, Figure 3(b) the 95th-percentile flow completion time for web search, Figure 3(c) the 99th-percentile flow completion time for web search, Figure 3(d) the average flow completion time for data mining, Figure 3(e) the 95th-percentile flow completion time for data mining, and Figure 3(f) the 99th-percentile flow completion time for data mining. Figure 3 compares three traditional load balancing mechanisms, RPS, DRILL, and TLB, with CRPS, the load balancing method based on packet loss rate coding provided by the present invention. RPS (Random Packet Spray) is a random packet-scattering load balancing mechanism, DRILL (Distributed Randomized In-network Localized Load-balancing) is a distributed, randomized, in-network switch-local load balancing mechanism, and TLB (Traffic-aware Load Balancing) is a traffic-aware load balancing mechanism. Compared with the other load balancing mechanisms, as the load increases CRPS successfully reduces the negative impact of packet loss and the timeout probability it causes, and CRPS consistently achieves the lowest flow completion times. This is because CRPS encodes the source packets before transmission, so as long as the receiving end receives a sufficient number of coded packets over non-congested paths it can successfully decode the original packets, avoiding retransmissions and even timeouts caused by packet loss. In addition, CRPS can detect changes in the packet loss rate and adjust the redundant coded packets accordingly to reduce tail latency. Although the flow sizes in the data mining workload follow a typical heavy-tailed distribution, CRPS also ensures the smallest tail latency compared with the other three schemes. The reason is that even though CRPS scatters packets over all parallel paths like the other mechanisms, some coded packets always reach the receiving end over non-congested paths, so the source packets can be decoded successfully. CRPS significantly improves latency performance under heavy load.

Figure 4 shows the probability of Incast occurring under RPS and CRPS for different numbers of concurrent flows and different buffer sizes. As shown in Figure 4, when the number of flows and the buffer size are varied, the probability of Incast increases as the number of flows increases. The growth of the Incast probability under CRPS is much lower than under RPS. The reason is that when timeouts are mainly caused by the loss of the tail packets of a small number of flows, the positive effect of coded packets dominates, because coded packets can recover lost packets in time and avoid timeouts. However, as the number of flows continues to increase, the growth of the Incast probability under CRPS becomes higher than under RPS, because the traffic overhead of the redundant coded packets erodes the benefit of packet coding. Finally, when the number of flows exceeds 50, the Incast probability of both CRPS and RPS reaches 1. With a buffer size of 100 packets, Incast occurs earlier under both RPS and CRPS because the buffer space is smaller. In short, under both RPS and CRPS, the smaller the buffer, the earlier Incast occurs for the same number of flows. Nevertheless, CRPS achieves a lower Incast probability than RPS.

It can therefore be seen that, compared with traditional methods, the load balancing method based on packet loss rate coding in cloud computing according to the present invention, by detecting the packet loss probability, applying forward error correction coding to the source packets, and randomly scattering the coded packets over all paths, can effectively reduce the probability of TCP retransmission timeouts caused by packet loss under the random packet scattering mechanism in burst traffic scenarios in data center networks.

The above embodiment is a preferred implementation of the present invention. The present invention can also be implemented in other ways, and any obvious substitution that does not depart from the concept of this technical solution falls within the protection scope of the present invention.

To make it easier for those of ordinary skill in the art to understand the improvements of the present invention over the prior art, some drawings and descriptions of the present invention have been simplified, and for the sake of clarity some other elements have been omitted from this application document; those of ordinary skill in the art should appreciate that these omitted elements may also constitute part of the content of the present invention.

Claims (5)

CN202210740895.7A | 2022-06-27 | 2022-06-27 | Load balancing method based on packet loss rate coding in cloud computing | Active | CN115134307B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210740895.7A | 2022-06-27 | 2022-06-27 | Load balancing method based on packet loss rate coding in cloud computing

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210740895.7A | 2022-06-27 | 2022-06-27 | Load balancing method based on packet loss rate coding in cloud computing

Publications (2)

Publication Number | Publication Date
CN115134307A (en) | 2022-09-30
CN115134307B | 2024-01-26

Family

ID=83380375

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210740895.7A (Active; CN115134307B) | Load balancing method based on packet loss rate coding in cloud computing | 2022-06-27 | 2022-06-27

Country Status (1)

Country | Link
CN (1) | CN115134307B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117938980B (en)* | 2024-03-21 | 2024-06-07 | 北京火山引擎科技有限公司 | Data transmission method, device, equipment and medium applied to content distribution network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101686106A (en)* | 2008-09-28 | 2010-03-31 | 华为技术有限公司 | Self-adaptive forward error correction (FEC) method, device and system
CN107508655A (en)* | 2017-07-19 | 2017-12-22 | 西南交通大学 | A kind of adaptive end to end network coding and transmission method
US9992088B1* | 2014-11-07 | 2018-06-05 | Speedy Packets, Inc. | Packet coding based network communication
CN109992419A (en)* | 2019-03-29 | 2019-07-09 | 长沙理工大学 | An optimized low-latency task assignment offloading method for collaborative edge computing
CN110198273A (en)* | 2019-05-31 | 2019-09-03 | 中南大学 | Multi-path transmission method in data center network based on network code
US10491329B1* | 2016-12-08 | 2019-11-26 | Amazon Technologies, Inc. | Transfer of data-redundancy encoded data via unreliable, connectionless protocol
CN113438055A (en)* | 2021-06-24 | 2021-09-24 | 西安电子科技大学 | Convolutional network coding transmission method based on unequal redundancy insertion
CN113612580A (en)* | 2021-08-03 | 2021-11-05 | 四川大学 | Screen update transmission method based on fountain code coding strategy and redundancy adaptive

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10999012B2* | 2014-11-07 | 2021-05-04 | Strong Force Iot Portfolio 2016, Llc | Packet coding based network communication
US10320526B1* | 2014-11-07 | 2019-06-11 | Strong Force Iot Portfolio 2016, Llc | Packet coding based network communication
AU2016245350B2* | 2015-04-09 | 2019-10-24 | Dejero Labs Inc. | Systems, devices and methods for distributing data with multi-tiered encoding
US11507064B2* | 2016-05-09 | 2022-11-22 | Strong Force Iot Portfolio 2016, Llc | Methods and systems for industrial internet of things data collection in downstream oil and gas environment
US10630315B2* | 2017-09-29 | 2020-04-21 | Intel Corporation | Technologies for applying a redundancy encoding scheme to segmented network packets


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Saber H et al. "A new reliability-based incremental redundancy hybrid ARQ scheme using LDPC codes." Information Theory. (full text)*
Sundararajan J K. "Network coding meets TCP: theory and implementation." Proceedings of the IEEE. (full text)*
董赞强. "Research on data communication technology based on network coding" (基于网络编码的数据通信技术研究). China Doctoral Dissertations Full-text Database. (full text)*
邵鹏飞. "Research on efficient retransmission methods based on network coding in wireless multicast" (无线多播中基于网络编码的高效重传方法研究). China Master's Theses Full-text Database. (full text)*
陈静 et al. "A network coding retransmission mechanism suitable for TCP" (适合TCP的网络编码重传机制). Journal on Communications (通信学报). (full text)*

Also Published As

Publication number | Publication date
CN115134307A (en) | 2022-09-30

Similar Documents

Publication | Title
US8306062B1 | Method and apparatus of adaptive large receive offload
KR101610715B1 | One-way data transmission and reception system, and one-way data transmission and reception method
CN108282402B | Packet scattering method based on coding in data center network
US9007905B2 | System to improve an Ethernet network
CN101515840B | A Routing Method for Multipath Parallel Transmission and Sending
CN102299899A | Method for accelerating TCP (Transmission Control Protocol) under severe channel
CN106301694A | A kind of reduce the method and device of data packet retransmission number of times in reliable multicast transport
JP2007028623A | System and method for adjusting BER / PER to accelerate network stream based transmission rates
CN111224888A | Method for sending message and message forwarding device
CN114401208B | Data transmission method and device, electronic equipment and storage medium
CN115134307B | Load balancing method based on packet loss rate coding in cloud computing
CN106027208A | Feedback-based network code TCP (Transmission Control Protocol) decoding method and device
CN110198273B | Multi-path transmission method based on network coding in data center network
CN100505608C | An adaptive congestion control method and system suitable for satellite networks
CN102546096A | Real-time multicasting self-adaptation optimization method based on unequal error protection
Li et al. | Achieving low latency for multipath transmission in RDMA based data center network
EP3809748B1 | Data transmitting method and device
CN104767826B | The distributive data center load-balancing method of fault tolerance based on end
CN110336709A | A Performance Evaluation Model of MPTCP Incast Based on Queuing Network
Bai et al. | Multi-path transmission protocol in VANET
CN110324255B | Data center network coding oriented switch/router cache queue management method
CN104580171A | Transmission method, device and system of TCP protocol
Sun et al. | DC^2-MTCP: Light-Weight Coding for Efficient Multi-Path Transmission in Data Center Network
CN112019304A | End-to-end real-time reliable transmission method based on network coding
Ma et al. | Sliding-window based batch forwarding using intra-flow random linear network coding

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
