



Technical Field
Embodiments of the present invention relate to the field of data communication and, more specifically, to a route selection method and a communication device in a data center network.
Background
The fat-tree architecture is an important deployment scheme for data center (DC) networks. Its defining feature is that every access-layer switch is connected to every aggregation-layer (or core-layer) switch. Consequently, any two servers that need to communicate can choose among multiple equal-cost paths (Equal Cost Multi Path, ECMP).
At present, equal-cost path selection is generally performed by applying a hash operation, such as a byte-wise sum or a bitwise AND, to the value of a fixed feature field of the packet. The corresponding path is then selected according to the hash result (1 byte); for example, the hash result is divided by the number of equal-cost paths and the remainder is taken as the selected path. Examples of such packet feature fields include the source MAC (Media Access Control) address, the destination MAC address, the source IP (Internet Protocol) address, and the destination IP address.
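The conventional selection described above can be illustrated with a minimal Python sketch. It assumes a byte-wise sum truncated to one byte as the hash; the function names and the MAC addresses are hypothetical and chosen only for illustration.

```python
def byte_sum_hash(*fields: bytes) -> int:
    """Sum every byte of the given feature fields and keep the low 8 bits,
    giving the 1-byte hash result mentioned in the text."""
    return sum(b for field in fields for b in field) & 0xFF


def select_path(hash_value: int, num_paths: int) -> int:
    """Divide the hash by the number of equal-cost paths and use the
    remainder as the index of the selected path."""
    return hash_value % num_paths


# Illustrative (made-up) source and destination MAC addresses.
src_mac = bytes.fromhex("001122334455")
dst_mac = bytes.fromhex("66778899aabb")

h = byte_sum_hash(src_mac, dst_mac)
path_index = select_path(h, 4)  # four equal-cost paths
```

Because the inputs never change for a given pair of servers, `path_index` is the same for every packet exchanged between them.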
Take a hash operation over the source MAC address field plus the destination MAC address field of a packet as an example. The source MAC address of the packet is the MAC address of the network interface card of the server that sends the packet, and the destination MAC address is the MAC address of the network interface card of the server that receives it. Because server MAC addresses are essentially fixed, the result of hashing the source MAC field plus the destination MAC field is also fixed. In other words, hash-based route selection always ends up selecting the same path. Because the selection is uneven in this way, the bandwidth of the other equal-cost paths is wasted.
Summary of the Invention
Embodiments of the present invention provide a route selection method and a communication device in a data center network that can solve the problem of path bandwidth being wasted because hash-based route selection is uneven.
In one aspect, an embodiment of the present invention provides a route selection method in a data center network, including: receiving a flow control signal sent by a downstream device on a forwarding path, where the flow control signal is generated by the downstream device after it detects that the forwarding path is congested and is used to notify an upstream device of the congestion; and, according to the flow control signal, suspending selection of the forwarding path during route selection.
In a second aspect, an embodiment of the present invention provides a route selection method in a data center network, including: generating a flow control signal when congestion is detected on a forwarding path to an upstream device, where the flow control signal is used to notify the upstream device of the congestion; and sending the flow control signal to the upstream device, so that the upstream device, according to the flow control signal, suspends selection of the forwarding path during route selection.
In a third aspect, an embodiment of the present invention provides a communication device in a data center network, including: a flow control module, configured to receive a flow control signal sent by a downstream device on a forwarding path, where the flow control signal is generated by the downstream device after it detects that the forwarding path is congested and is used to notify an upstream device of the congestion; and a route selection module, configured to suspend selection of the forwarding path during route selection according to the flow control signal received by the flow control module.
In a fourth aspect, an embodiment of the present invention provides a communication device in a data center network, including: a generating module, configured to generate a flow control signal when congestion is detected on a forwarding path to an upstream device, where the flow control signal is used to notify the upstream device of the congestion; and a sending module, configured to send the flow control signal generated by the generating module to the upstream device, so that the upstream device, according to the flow control signal, suspends selection of the forwarding path during route selection.
Based on the flow control signal returned by the downstream device, embodiments of the present invention suspend selection of the congested forwarding path, so that other equal-cost paths can be selected and the bandwidth resources on those equal-cost paths are not wasted.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Figure 1 is a schematic architecture diagram of a data center network to which embodiments of the present invention can be applied.
Figure 2 is a flowchart of a route selection method in a data center network according to one embodiment of the present invention.
Figure 3 is a flowchart of a route selection method in a data center network according to another embodiment of the present invention.
Figure 4 is a schematic diagram of an application scenario of one embodiment of the present invention.
Figure 5 is a schematic flowchart of the route selection process in the example scenario of Figure 4.
Figure 6 is a block diagram of a communication device according to one embodiment of the present invention.
Figure 7 is a block diagram of a communication device according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Figure 1 is a schematic architecture diagram of a data center network to which embodiments of the present invention can be applied. The data center network of Figure 1 uses a fat-tree architecture, but embodiments of the present invention are not limited to this and can also be applied to other data center networks that require route selection.
The fat-tree architecture is an important deployment scheme for DC networks, and its typical network topology is shown in Figure 1. In Figure 1, Svri (i = 1, 2, ...) are servers; S1x (x = 1, 2, ...) are access-layer switches or routers (referred to as access switches or access routers); and S2y (y = 1, 2, ...) are aggregation-layer switches or routers (referred to as aggregation switches or aggregation routers) or core-layer switches or routers (referred to as core switches or core routers). For brevity, Figure 1 shows only a three-layer structure; the DC networks to which embodiments of the present invention can be applied are not limited to this specific example, and may have more layers, each with more or fewer nodes than shown.
As Figure 1 shows, every access-layer switch S1x is connected to every aggregation-layer (or core-layer) switch S2y, so any two servers that need to communicate have multiple equal-cost paths to choose from. For example, for traffic from server Svr1 to server Svr4 there are four equal-cost paths at switch S11, corresponding to selecting the upper-level switch S21, S22, S23, or S24 respectively. How to perform route selection is a key problem in the fat-tree architecture.
First, the problem with the existing hash-based route selection method is briefly described using the example of Figure 1. Assume that traffic flows from server Svr1 to server Svr4, as shown in Figure 1. When switch S11 performs the hash-based route selection described above, the source MAC address of the packet is the MAC address of the network interface card of server Svr1, and the destination MAC address is the MAC address of the network interface card of server Svr4. Because server MAC addresses are essentially fixed, the hash result is also fixed. In other words, the hash-based route selection at S11 always selects the same one of the four links, and the bandwidth of the other links is wasted.
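The effect can be reproduced with a small simulation. The sketch below assumes a byte-sum hash over the two MAC addresses and made-up addresses for Svr1 and Svr4; only the uplink names S21 to S24 come from Figure 1.

```python
from collections import Counter
from typing import List

UPLINKS = ["S21", "S22", "S23", "S24"]  # upper-level switches seen from S11


def hash_select(src_mac: bytes, dst_mac: bytes, uplinks: List[str]) -> str:
    """Pick an uplink from a 1-byte byte-sum hash of the two MAC addresses."""
    h = sum(src_mac + dst_mac) & 0xFF
    return uplinks[h % len(uplinks)]


# Hypothetical NIC addresses for Svr1 and Svr4; they never change.
svr1_mac = bytes.fromhex("02000000000a")
svr4_mac = bytes.fromhex("02000000000d")

# Route 1000 packets of the Svr1 -> Svr4 traffic and count uplink usage.
usage = Counter(hash_select(svr1_mac, svr4_mac, UPLINKS) for _ in range(1000))
```

With fixed inputs, `usage` contains a single uplink that carries all 1000 packets, while the other three uplinks carry none.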
Figure 2 is a flowchart of a route selection method in a data center network according to one embodiment of the present invention. The method of Figure 2 may be performed by an upstream device in the DC network. The upstream device here may be a switch or router at the access layer, aggregation layer, or core layer that needs to select a forwarding path.
201: Receive a flow control signal sent by a downstream device on a forwarding path. The flow control signal is generated by the downstream device after it detects that the forwarding path is congested, and is used to notify the upstream device of the congestion.
The downstream device here may be located at the access layer, aggregation layer, or core layer; it only needs to be downstream of the upstream device.
Because hash-based route selection is uneven, a forwarding path may become congested. In embodiments of the present invention, "congestion" covers not only the case where the packet queue (or buffer) of the forwarding path at the downstream device is full, but also the case where the packet queue (or buffer) is about to become full, for example when the remaining space in the queue or buffer falls below a threshold. The threshold may be a number of packets (for example, 2) or a specific percentage (for example, 10%). Embodiments of the present invention impose no limitation on this.
Embodiments of the present invention do not limit the granularity of the flow control signal. For example, the flow control signal may be based on the entire port, on packet priority, or on some other finer or coarser granularity (such as other characteristics of the packet or queue).
202: According to the flow control signal, suspend selection of the forwarding path during route selection.
For example, during route selection the upstream device may exclude the forwarding path, that is, select only among the paths that are equal-cost with the forwarding path.
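Step 202 can be sketched as a filter applied before the usual hash-and-remainder selection. The code below is a minimal illustration, not a definitive implementation: the function name is hypothetical, and falling back to the full path list when every path is paused is an assumption that the text does not address.

```python
from typing import List, Set


def select_excluding_paused(hash_value: int, paths: List[str],
                            paused: Set[str]) -> str:
    """Choose among equal-cost paths, skipping those under flow control."""
    candidates = [p for p in paths if p not in paused]
    if not candidates:
        # Assumption: if every path is paused, fall back to the full list.
        candidates = paths
    return candidates[hash_value % len(candidates)]


paths = ["S21", "S22", "S23", "S24"]
before = select_excluding_paused(27, paths, paused=set())
after = select_excluding_paused(27, paths, paused={before})
```

Here `before` is the path the plain hash would always pick; once that path is reported as congested, the same hash value maps to one of the remaining equal-cost paths.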
Based on the flow control signal returned by the downstream device, embodiments of the present invention suspend selection of the congested forwarding path, so that other equal-cost paths can be selected and the bandwidth resources on those equal-cost paths are not wasted.
Figure 3 is a flowchart of a route selection method in a data center network according to another embodiment of the present invention. The method of Figure 3 may be performed by a downstream device (such as a switch or router) in the DC network, and corresponds to the method of Figure 2.
301: When congestion is detected on the forwarding path to an upstream device, generate a flow control signal. The flow control signal is used to notify the upstream device of the congestion.
Because hash-based route selection is uneven, a forwarding path may become congested. In embodiments of the present invention, "congestion" covers not only the case where the packet queue (or buffer) is full, but also the case where the packet queue (or buffer) is about to become full, for example when the remaining space in the queue or buffer falls below a threshold. The threshold may be a number of packets (for example, 2) or a specific percentage (for example, 10%). Embodiments of the present invention impose no limitation on this.
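The congestion test described above might be sketched as follows. The function and parameter names are hypothetical; the sketch assumes that queue occupancy and capacity are both measured in packets and that either threshold may be left unset.

```python
from typing import Optional


def is_congested(queue_len: int, capacity: int,
                 free_packets_threshold: Optional[int] = None,
                 free_ratio_threshold: Optional[float] = None) -> bool:
    """Return True if the queue is full or its free space is below a threshold."""
    free = capacity - queue_len
    if free <= 0:
        return True  # queue (or buffer) already full
    if free_packets_threshold is not None and free < free_packets_threshold:
        return True  # fewer free slots than the packet-count threshold
    if free_ratio_threshold is not None and free / capacity < free_ratio_threshold:
        return True  # free share below the percentage threshold
    return False
```

For example, with a capacity of 100 packets, `is_congested(99, 100, free_packets_threshold=2)` reports congestion because only one slot remains, and `is_congested(95, 100, free_ratio_threshold=0.10)` reports congestion because only 5% of the queue is free.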
Embodiments of the present invention do not limit the granularity of the flow control signal. For example, the flow control signal may be based on the entire port, on packet priority, or on some other finer or coarser granularity (such as other characteristics of the packet or queue).
302: Send the flow control signal to the upstream device, so that the upstream device, according to the flow control signal, suspends selection of the forwarding path during route selection.
For example, during route selection the upstream device may exclude the forwarding path, that is, select only among the paths that are equal-cost with the forwarding path.
Based on the flow control signal returned by the downstream device, embodiments of the present invention suspend selection of the congested forwarding path, so that other equal-cost paths can be selected and the bandwidth resources on those equal-cost paths are not wasted.
The embodiments of Figures 2 and 3 are described in more detail below with reference to the specific examples of Figures 4 and 5. These examples are intended only to help a person skilled in the art better understand the present invention, not to limit its scope.
Figure 4 is a schematic diagram of an application scenario of one embodiment of the present invention. Figure 5 is a schematic flowchart of the route selection process in the example scenario of Figure 4.
As shown in Figure 4, assume that data is sent from source server Sv1 to destination server Sv2, and that switch S0 is an upstream device on the forwarding path. N equal-cost paths (path 1, path 2, ..., path N) pass through switch S0, and the downstream switches of these equal-cost paths are S1, S2, ..., SN respectively, where N is a positive integer. The connections in Figure 4 are only schematic: switch S0 and source server Sv1 need not be directly connected, and may be connected indirectly through one or more layers of switches. Likewise, switches S1, S2, ..., SN and destination server Sv2 need not be directly connected, and may be connected indirectly through one or more layers of switches.
Figure 5 is described using congestion at S1 (queue full or about to become full) as an example; embodiments of the present invention apply equally to path selection between other upstream and downstream devices.
501: Upstream device S0 performs hash-based route selection and selects path 1, that is, it selects downstream device S1 to forward packets. Because the hash-based selection is uneven, downstream device S1 becomes congested.
502: After detecting that path 1 is congested, downstream device S1 sends a flow control signal to upstream device S0.
Optionally, as one embodiment, the flow control signal may be based on packet priority. Priority-based Flow Control (PFC), standardized as 802.1Qbb and also known as Per Priority Pause or Class Based Flow Control (CBFC), is a priority-based flow control mechanism. When a priority queue on a port of the downstream device is about to become congested, the downstream device sends a PFC "back pressure" signal to the upstream device, and the upstream device stops sending packets of that priority queue. The mechanism propagates hop by hop until it reaches the source device. The back pressure applies only to one or several priorities, not to the entire port.
Assume here that the packets in the congested queue have priority X. In this case, the back pressure signal is sent only for one or several priorities rather than for the entire port, which makes route selection control more flexible.
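To make the per-priority nature of the signal concrete, the following sketch packs a PFC-style control payload. The layout used here (opcode 0x0101, a 16-bit class-enable vector whose low eight bits select priorities, then eight 16-bit pause timers) reflects the author's understanding of IEEE 802.1Qbb and should be checked against the standard; the function name is hypothetical and the Ethernet header is omitted.

```python
import struct
from typing import Dict

PFC_OPCODE = 0x0101  # MAC Control opcode for priority-based flow control


def build_pfc_payload(pause_quanta: Dict[int, int]) -> bytes:
    """Pack opcode, class-enable vector, and one pause timer per priority.

    pause_quanta maps a priority (0..7) to a pause time in quanta (0..65535).
    """
    enable_vector = 0
    timers = [0] * 8
    for priority, quanta in pause_quanta.items():
        if not 0 <= priority <= 7:
            raise ValueError("priority must be in 0..7")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError("pause time must fit in 16 bits")
        enable_vector |= 1 << priority  # mark this priority as addressed
        timers[priority] = quanta
    # Network byte order: opcode, enable vector, then eight timers.
    return struct.pack("!HH8H", PFC_OPCODE, enable_vector, *timers)


# Back-pressure only priority 3 (playing the role of "priority X").
payload = build_pfc_payload({3: 0xFFFF})
```

Only the bit and timer for priority 3 are set, so traffic of the other priorities on the same port is unaffected.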
503: After detecting the flow control signal, upstream device S0 suspends sending packets with priority X. At the same time, upstream device S0 may record the flow control signal, that is, generate a corresponding flow control record.
Specifically, upstream device S0 may suspend sending the packets of the corresponding queue (the queue whose packets have priority X).
504: When subsequently selecting a route for packets with priority X, upstream device S0 excludes the paths that have a flow control record (such as path 1 in Figure 4), that is, it selects only among the equal-cost paths that have no flow control record (such as path 2 to path N in Figure 4).
Assume that path 2 is selected, that is, downstream device S2 is selected to forward the packets.
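Steps 503 and 504 can be sketched with a small class that keeps flow control records per (path, priority) pair. The class and method names are hypothetical, and raising an error when every path is paused for a priority is an assumption; the text does not specify that case.

```python
from typing import List, Set, Tuple


class UpstreamSelector:
    """Keeps per-priority flow control records and selects among the rest."""

    def __init__(self, paths: List[str]) -> None:
        self.paths = list(paths)
        self.records: Set[Tuple[str, int]] = set()  # (path, priority)

    def on_flow_control(self, path: str, priority: int) -> None:
        """Step 503: record the flow control signal for this path and priority."""
        self.records.add((path, priority))

    def select(self, hash_value: int, priority: int) -> str:
        """Step 504: choose only among paths with no record for this priority."""
        candidates = [p for p in self.paths
                      if (p, priority) not in self.records]
        if not candidates:
            # Assumption: the behavior when all paths are paused is unspecified.
            raise RuntimeError("all equal-cost paths are paused for this priority")
        return candidates[hash_value % len(candidates)]


s0 = UpstreamSelector(["path1", "path2", "path3"])
first = s0.select(0, priority=3)        # the plain hash picks path1
s0.on_flow_control("path1", priority=3)
second = s0.select(0, priority=3)       # path1 is now excluded for priority 3
other = s0.select(0, priority=5)        # priority 5 is unaffected
```

Keying the records by priority keeps the exclusion as narrow as the signal itself: packets of other priorities can still use path 1.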
505: After the flow control signal for path 1 is no longer in effect, upstream device S0 again allows path 1 to be selected during route selection. For example, upstream device S0 may delete the corresponding flow control record, restoring the ability of path 1 to be selected.
For example, when path 1 is no longer congested, downstream device S1 may notify upstream device S0 that the flow control signal for path 1 is no longer in effect. Alternatively, upstream device S0 may determine by some other mechanism whether the flow control signal for path 1 is still in effect. For example, upstream device S0 may start a timer after receiving or recording the flow control information for path 1, and treat that information as no longer in effect when the timer expires. The expiry time of the timer may be conveyed by downstream device S1 to upstream device S0 in the flow control signal. Embodiments of the present invention do not limit the mechanism for discovering that a flow control signal is no longer in effect; it is only required that upstream device S0 allow path 1 to be selected again once the flow control signal for path 1 is no longer in effect.
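Both release mechanisms mentioned above, an explicit notification and a timer, can be sketched in one record store. The class name, method names, and the injectable clock are hypothetical and exist only to keep the example self-contained and testable.

```python
import time
from typing import Callable, Dict, Optional


class FlowControlRecords:
    """Flow control records that end on explicit resume or on timer expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # path -> deadline in clock seconds, or None if there is no timer
        self._deadline: Dict[str, Optional[float]] = {}

    def pause(self, path: str, duration: Optional[float] = None) -> None:
        """Record a flow control signal, optionally with an expiry time."""
        self._deadline[path] = (None if duration is None
                                else self._clock() + duration)

    def resume(self, path: str) -> None:
        """The downstream device reported that the path is no longer congested."""
        self._deadline.pop(path, None)

    def is_paused(self, path: str) -> bool:
        if path not in self._deadline:
            return False
        deadline = self._deadline[path]
        if deadline is not None and self._clock() >= deadline:
            del self._deadline[path]  # timer expired: delete the record
            return False
        return True
```

A route selector would call `is_paused(path)` for each equal-cost path and skip those for which it returns `True`; once the record is deleted, the path becomes selectable again as in step 505.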
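According to embodiments of the present invention, paths can be adjusted dynamically according to the PFC information of the downstream devices, which solves the problem that, in a fat-tree architecture, some equal-cost paths are congested while others are idle. As long as some path is not congested, traffic is preferentially forwarded over the uncongested path, which effectively improves network bandwidth utilization and avoids wasting bandwidth.
Therefore, deploying the above route selection scheme on the access switches (or routers), aggregation switches (or routers), and core switches (or routers) of a network can effectively improve the load capacity of the network.
Figure 6 is a block diagram of a communication device according to one embodiment of the present invention. One example of the communication device 60 of Figure 6 is the upstream device (switch or router) described above; it includes a flow control module 61 and a route selection module 62.
The flow control module 61 receives a flow control signal sent by a downstream device on a forwarding path. The flow control signal is generated by the downstream device after it detects that the forwarding path is congested, and is used to notify the upstream device of the congestion. The route selection module 62 suspends selection of the forwarding path during route selection according to the flow control signal received by the flow control module 61.
Based on the flow control signal returned by the downstream device, embodiments of the present invention suspend selection of the congested forwarding path, so that other equal-cost paths can be selected and the bandwidth resources on those equal-cost paths are not wasted.
The communication device 60 can perform each process involving the upstream device in the methods of Figures 2 to 4 above; to avoid repetition, these are not described again in detail.
Optionally, as one embodiment, the flow control signal received by the flow control module 61 may be based on packet priority (for example, the one or more priorities X in the embodiment of Figure 5); for example, the flow control signal may be a PFC back pressure signal. In this case, the route selection module 62 may suspend selection of the forwarding path when selecting a route for packets with that priority (such as priority X above). Because the back pressure signal is sent only for one or several priorities rather than for the entire port, route selection control becomes more flexible.
Optionally, as another embodiment, the flow control module 61 may also, before route selection, suspend sending packets with the above priority (such as priority X) according to the flow control signal. At the same time, the flow control module 61 may pass the flow control information to the route selection module 62. The route selection module 62 records the flow control information, that is, generates a corresponding flow control record. In subsequent route selection, the route selection module 62 excludes the paths that have a flow control record (such as path 1 in Figure 4), that is, it selects only among the equal-cost paths that have no flow control record (such as path 2 to path N in Figure 4).
Optionally, as another embodiment, the route selection module 62 may also allow the forwarding path to be selected during route selection after the flow control signal is no longer in effect. For example, the route selection module 62 may delete the corresponding flow control record, restoring the ability of the forwarding path to be selected.
Optionally, as another embodiment, the communication device 60 may be at least one of an access switch, an access router, an aggregation switch, an aggregation router, a core switch, and a core router.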
Figure 7 is a block diagram of a communication device according to another embodiment of the present invention. One example of the communication device 70 of Figure 7 is the downstream device (switch or router) described above; it includes a generating module 71 and a sending module 72.
When the generating module 71 detects congestion on the forwarding path to an upstream device, it generates a flow control signal, which notifies the upstream device of the congested forwarding path. The sending module 72 sends the flow control signal generated by the generating module 71 to the upstream device, so that the upstream device, according to the flow control signal, suspends selection of the forwarding path during route selection.
Based on the flow control signal returned by the downstream device, embodiments of the present invention suspend selection of the congested forwarding path, so that other equal-cost paths can be selected and the bandwidth resources on those equal-cost paths are not wasted.
The communication device 70 can perform each process involving the downstream device in the methods of Figures 2 to 4 above; to avoid repetition, these are not described again in detail.
Optionally, as one embodiment, the generating module 71 may generate a flow control signal based on packet priority (for example, the PFC back pressure signal described above), so that the upstream device suspends selection of the forwarding path when selecting a route for packets with that priority. Because the back pressure signal is sent only for one or several priorities rather than for the entire port, route selection control becomes more flexible.
Optionally, as another embodiment, the sending module 72 may also notify the upstream device, when the path is no longer congested, that the flow control signal is no longer in effect, so that the upstream device can select the path during route selection.
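The behavior of the downstream device could be sketched as below, under stated assumptions: congestion is judged by a single free-space threshold, the signal is a plain data object rather than a real PFC frame, and the `send` callback stands in for the sending module 72. All class, field, and parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass(frozen=True)
class FlowControlSignal:
    path: str
    priority: int
    paused: bool  # True: congestion reported; False: signal no longer in effect


class DownstreamDevice:
    def __init__(self, path: str, capacity: int, free_threshold: int,
                 send: Callable[[FlowControlSignal], None]) -> None:
        self.path = path
        self.capacity = capacity
        self.free_threshold = free_threshold
        self._send = send             # plays the role of sending module 72
        self._paused: Set[int] = set()

    def update_queue(self, priority: int, queue_len: int) -> None:
        """Plays the role of generating module 71: signal only on state changes."""
        congested = (self.capacity - queue_len) < self.free_threshold
        if congested and priority not in self._paused:
            self._paused.add(priority)
            self._send(FlowControlSignal(self.path, priority, True))
        elif not congested and priority in self._paused:
            self._paused.discard(priority)
            self._send(FlowControlSignal(self.path, priority, False))


sent = []
s1 = DownstreamDevice("path1", capacity=100, free_threshold=10, send=sent.append)
s1.update_queue(priority=3, queue_len=50)  # plenty of space: nothing sent
s1.update_queue(priority=3, queue_len=95)  # about to fill: pause signal sent
s1.update_queue(priority=3, queue_len=20)  # drained: release signal sent
```

Sending a signal only when the congestion state changes is a design choice of this sketch; a real device might instead refresh the signal periodically.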
可选地,作为另一实施例,通信设备70可以是接入交换机、接入路由器、汇聚交换机、汇聚路由器、核心交换机、核心路由器中的至少一种。Optionally, as another embodiment, the communication device 70 may be at least one of an access switch, an access router, an aggregation switch, an aggregation router, a core switch, and a core router.
根据本发明实施例的通信网络可包括上述通信设备60或70。A communication network according to an embodiment of the present invention may include the
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。Those skilled in the art can appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present invention.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the above-described system, device and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components can be combined or May be integrated into another system, or some features may be ignored, or not implemented. In another point, the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。If the functions described above are realized in the form of software function units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in various embodiments of the present invention. The aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk or optical disc, etc., which can store program codes. .
The foregoing describes merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
| Application Number | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN2011103430385A | CN102377672A (en) | 2011-11-03 | 2011-11-03 | Routing method and communication equipment in data center network |
| Publication Number | Publication Date |
|---|---|
| CN102377672A (en) | 2012-03-14 |
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| C02 | Deemed withdrawal of patent application after publication (patent law 2001) | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2012-03-14 |