Technical Field
The invention relates to the field of communication technology, and in particular to a software-defined passive optical interconnection network structure and a data communication method.
Background
Optical networks offer high bandwidth and low latency and are widely used in backbone networks, access networks, and data center networks. Taking data centers as an example, a traditional data center network (DCN) adopts a layered tree topology and uses electrical Ethernet switches as switching nodes. Such a network has several defects: 1) low bisection bandwidth: network throughput is limited by the topology and by the capacity of the root-node Ethernet switches; 2) poor scalability: the electrical switches at the top of the tree have limited capacity and become a bandwidth bottleneck that limits network expansion; 3) high energy consumption: the access, aggregation, and core layers use a large number of electrical Ethernet switches, which consume a great deal of energy, and energy consumption is a key issue that data center operators cannot ignore.
Optical interconnection networks can address the low bandwidth, poor scalability, and high energy consumption of traditional data center networks. Taking data centers as an example, optical interconnection technology has been introduced into data center networks, with the Helios architecture as a representative. Helios is a two-layer structure composed of electrical top-of-rack (ToR) switches and core-layer switches. The ToR switches are ordinary electrical packet switches, and the core layer consists of electrical packet switches and optical circuit switches, forming a hybrid electrical/optical data center network. The core-layer electrical packet switches carry all-to-all communication among the ToR switches, while the optical circuit switches carry high-bandwidth, slowly changing traffic (usually long-lived flows) among the ToR switches. Where a ToR switch connects to an optical circuit switch, wavelength division multiplexing combines the optical signals on different carrier wavelengths from multiple ports onto a single fiber link, exploiting the high bandwidth and large capacity of optical fiber to improve data center network performance. This class of optical interconnection network has some shortcomings: 1) large packet delay: the optical switching relies on optical circuit switches, i.e. switches based on micro-electro-mechanical systems (MEMS); when the traffic matrix changes, reconfiguring the optical circuit switch is time-consuming, typically tens of milliseconds, which leads to long network delay; 2) high energy consumption: the hybrid architecture still contains electrical packet switches, so packets undergo optical-electrical-optical conversion, which consumes a large amount of energy and increases the energy consumption of the network.
To further reduce the cost and energy consumption of optical interconnection networks, passive optical network technology has been widely applied to optical interconnection structures. Taking data centers as an example, some new data center networks use passive arrayed waveguide grating routers (AWGR) or passive optical couplers to interconnect the racks. Because passive optical devices work without a power supply, they greatly reduce energy consumption compared with networks based on electrical Ethernet; POXN is a typical representative of this class. To schedule traffic and improve throughput, the POXN architecture proposes a distributed medium access control protocol, HEDA, to coordinate communication among the servers within a rack. POXN also has shortcomings: 1) a distributed access control protocol causes servers to exchange control information frequently, and because data and control information share the optical link resources in POXN, this increases the complexity of the control plane; 2) HEDA, being distributed, requires synchronization among the servers in a rack, and because the distances between servers in the same rack are short, synchronization places very high demands on the control plane of the network.
In summary, current optical interconnection network structures still have the following problems:
High network energy consumption: some application scenarios still use a hybrid electrical/optical topology with electrical Ethernet switches. Taking data centers as an example, optical switching is already widely used in the aggregation and core layers, but the intra-rack network still uses electrical packet switches. Optical-electrical-optical (O-E-O) conversion is required at the ports of an electrical packet switch and consumes a large amount of energy. Optical interconnection technology is needed to replace the electrical top-of-rack switch and interconnect the servers within a rack, thereby reducing the energy consumption of the data center network. At the same time, the cost of the optical interconnection scheme should not exceed that of the electrical Ethernet switch scheme.
Complex network control: control complexity is a difficult problem in passive optical networks. To avoid wavelength collisions, the network must schedule wavelength resources and allocate network resources. Control in the optical domain is complicated, however, especially over short distances and under delay constraints. Traditional distributed control requires the terminal nodes to negotiate with one another and to make decisions independently. The complexity shows up in three ways: frequent exchange of control information reduces network throughput; terminal nodes rely on ranging to stay synchronized, which places high demands on the synchronization system; and every node must run the same fair and efficient distributed resource allocation algorithm. An optical interconnection network therefore needs a new control approach that reduces control complexity while providing finer control granularity.
Large end-to-end packet delay: some application scenarios still use a hybrid electrical/optical topology in which part of the network uses optical interconnection and the rest uses electrical Ethernet switches. In a data center network, for example, such a switch works in store-and-forward mode: it converts the received optical signal into an electrical signal and stores it, then, after queuing and processing, converts the electrical signal back into an optical signal and sends it to the next hop. This process results in large end-to-end packet delay and makes it hard to meet the real-time requirements of some applications, such as cloud computing applications.
Summary of the Invention
In view of this, the object of the invention is to provide a software-defined passive optical interconnection network structure and a data communication method.
The first object of the invention is achieved by the following technical solution:
A software-defined passive optical interconnection network structure comprises terminal nodes, an optical-coupler-based switching fabric, and a software-defined network controller. The optical-coupler-based switching fabric interconnects the terminal nodes to form an optical data channel; the software-defined network controller connects to each terminal node to form an electrical control channel, collects network status information, and dynamically allocates network resources to the terminal nodes.
Further, the software-defined network controller adopts a software-defined medium access control mechanism, specifically: each terminal node sends a request message to the software-defined network controller over the control channel; after collecting the request messages of the terminal nodes, the controller allocates network resources according to their demands and notifies each terminal node of the allocation result over the control channel; the terminal nodes then use the allocated network resources to carry out data communication.
Further, the software-defined medium access control mechanism uses the max-min fair share bandwidth allocation algorithm to allocate network resources to the terminal nodes.
Further, each terminal node is equipped with a light source, which may be a fixed-wavelength laser or a wavelength-tunable laser and which provides the terminal node with an optical carrier for data modulation and transmission.
Further, the optical-coupler-based switching fabric is a Banyan network formed by cascading 2×2 optical couplers or 3×3 optical couplers.
The second object of the invention is achieved by the following technical solution: a data communication method based on the software-defined passive optical interconnection network structure, comprising the following steps:
Registration: when a terminal node starts up or joins the network, it actively sends a "registration message" to the software-defined network controller. The message contains the terminal node's IP address, MAC address, and current buffer queue length. On receiving the registration message, the controller returns a "registration acknowledgment" to the terminal node to indicate that registration succeeded; if the terminal node does not receive the acknowledgment, it keeps trying to register with the controller. The controller maintains a registration information table that records the nodes currently active in the network, and it allocates network resources to the terminal nodes listed in this table;
Bandwidth allocation: in each round, every terminal node sends the software-defined network controller a request message containing the node's buffer queue length. After collecting the request messages of the terminal nodes, the controller uses the max-min fair share bandwidth allocation algorithm to compute the amount of resources allocated to each terminal node. When allocation is complete, the controller sends a "grant message" to the terminal nodes over the control channel; the grant message carries the result of this round's bandwidth allocation;
Data transmission: after receiving the grant message, each terminal node sends data as authorized by the software-defined network controller, transmitting only in its own time period so that collisions are avoided. When its time slot ends, each terminal node sends a "report message" to the controller to indicate that its transmission is complete and to report its current buffer queue length. The controller updates the stored buffer queue length and uses it as the basis for network resource allocation in the next cycle.
By adopting the above technical solutions, the invention has the following advantages:
1) The structure of the invention can reduce network cost. When the scheme is applied to the intra-rack network of a data center, numerical analysis of the network cost shows that it is lower than that of an existing data center network using electrical packet top-of-rack (ToR) switches.
2) The structure of the invention can reduce network energy consumption. When the scheme is applied to the intra-rack network of a data center, numerical analysis of the energy consumption shows that it is lower than that of an existing data center network using electrical packet ToR switches.
3) The structure of the invention can reduce the end-to-end packet delay between terminal nodes. When the scheme is applied to the intra-rack network of a data center, simulation of the end-to-end packet delay shows that the delay between terminal nodes is smaller than in an existing data center network using electrical packet ToR switches.
4) The structure of the invention can improve network reliability. The scheme uses passive optical devices, so the network structure is more reliable than one built on traditional Ethernet switches.
5) The structure of the invention adopts a software-defined medium access control mechanism that uses the max-min fair share bandwidth allocation algorithm to ensure that network resources are allocated fairly among the terminal nodes; in addition, compared with a traditional fixed time-division multiplexing scheme, this algorithm improves network throughput.
Brief Description of the Drawings
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings, in which:
Figure 1 shows the software-defined passive optical interconnection network structure;
Figure 2 shows the application of the software-defined passive optical interconnection network structure to the intra-rack network of a data center;
Figure 3 shows the software-defined medium access control mechanism;
Figure 4 compares the networking cost of the SD-POIN scheme and the electrical Ethernet switch scheme;
Figure 5 compares the networking energy consumption of the SD-POIN scheme and the electrical Ethernet switch scheme;
Figure 6 compares the end-to-end packet delay of the SD-POIN scheme and the electrical Ethernet switch scheme;
Figure 7 compares the network throughput of the max-min fair share bandwidth allocation algorithm and the fixed time-division multiplexing algorithm;
Figure 8 shows the overall structure of the software-defined medium access control mechanism.
Detailed Description
Preferred embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments only illustrate the invention and do not limit its scope of protection.
A software-defined passive optical interconnection network structure (SD-POIN, Software Defined Passive Optical Interconnection Networks) comprises terminal nodes, an optical-coupler-based switching fabric, and a software-defined network controller. The optical-coupler-based switching fabric interconnects the terminal nodes to form an optical data channel; the software-defined network controller connects to each terminal node to form an electrical control channel, collects network status information, and dynamically allocates network resources to the terminal nodes.
Each terminal node must be equipped with a light source, which provides the optical carrier for data modulation and transmission. A terminal node may use either a fixed-wavelength laser or a wavelength-tunable laser as its light source. With fixed-wavelength lasers, all terminal nodes use the same wavelength, so different terminal nodes must transmit on that wavelength at different times; otherwise their signals collide. With wavelength-tunable lasers, terminal nodes can communicate on different wavelengths, and network resource allocation is more flexible. Because fixed-wavelength lasers cost less than wavelength-tunable lasers, the invention preferably uses identical fixed-wavelength lasers as the terminal node light sources.
Optical-coupler-based switching fabric: an optical coupler is a passive optical device that works without a power supply, so an optical switching network based on couplers reduces the energy consumption of the network. An optical coupler broadcasts the optical signal it receives, which suits multicast and incast traffic patterns well. Commercial optical couplers are mainly 2×2 and 3×3 couplers; to interconnect more terminal nodes, the invention uses a Banyan network formed by cascading 2×2 optical couplers or 3×3 optical couplers.
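As a rough sizing aid for such a cascade, the sketch below counts the stages and couplers and estimates the ideal splitting loss. It assumes a full Banyan with radix^stages ports built from ideal couplers that split power equally and have no excess loss; the function name `coupler_fabric` and the returned field names are illustrative and are not part of the invention.

```python
import math

def coupler_fabric(num_nodes: int, radix: int = 2) -> dict:
    """Size a Banyan-style cascade of radix x radix couplers (radix 2 or 3)."""
    if radix not in (2, 3):
        raise ValueError("only 2x2 and 3x3 couplers are considered")
    if num_nodes < 1:
        raise ValueError("at least one terminal node is required")
    ports, stages = 1, 0
    while ports < num_nodes:            # smallest radix**stages >= num_nodes
        ports *= radix
        stages += 1
    return {
        "ports": ports,                               # fabric size actually built
        "stages": stages,                             # cascaded coupler stages
        "couplers": stages * (ports // radix),        # ports/radix couplers per stage
        "ideal_split_loss_db": 10 * math.log10(ports),  # power divided over all outputs
    }

print(coupler_fabric(16, radix=2))
print(coupler_fabric(20, radix=3))
```

Under these assumptions, 16 nodes with 2×2 couplers need 4 stages of 8 couplers each and see about 12 dB of ideal splitting loss at every receiver; real devices add excess loss that this estimate ignores.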
Software-defined network controller: the software-defined network controller is connected to each terminal node through electrical Ethernet, forming the control channel. Because the terminal nodes communicate on the same wavelength, collisions occur if several terminal nodes send data into the network at the same time. The controller therefore has to solve two technical problems: 1) access control: the controller allocates network resources by time-division multiplexing, allowing different terminal nodes to send data to other terminal nodes at different times so that collisions are avoided; 2) dynamic bandwidth allocation: the controller collects the bandwidth demand of each node and allocates bandwidth dynamically according to demand to improve network throughput, while keeping the allocation of network resources fair.
The software-defined network controller adopts a software-defined medium access control mechanism (SD-MAC, Software Defined Medium Access Control). The goal of this mechanism is to allocate network resources fairly among the terminal nodes, thereby avoiding collisions and improving network throughput. The mechanism works as follows: each terminal node sends a request message to the controller over the control channel; after collecting the request messages of the terminal nodes, the controller allocates network resources according to their demands and notifies each terminal node of the allocation result over the control channel; the terminal nodes then use the allocated network resources to carry out data communication.
The software-defined medium access control mechanism uses the max-min fair share bandwidth allocation algorithm to allocate network resources to the terminal nodes. With this algorithm, network resources are shared fairly among the terminal nodes that need to communicate: no terminal node receives more than it requested, and a terminal node whose request is not fully satisfied receives at least as much as any terminal node whose request is satisfied, which preserves the utilization efficiency of the network resources.
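The allocation rule described above can be sketched with the usual progressive-filling form of max-min fairness. This is a minimal illustration, not the controller's actual implementation: the function name `max_min_fair_share`, the use of queue lengths as demands, and the single scalar `capacity` per cycle are assumptions made for the example.

```python
def max_min_fair_share(demands: dict, capacity: float) -> dict:
    """Split `capacity` among nodes so that no node exceeds its demand and
    unsatisfied nodes all receive the same (largest possible) share."""
    allocation = {node: 0.0 for node in demands}
    pending = sorted(demands, key=demands.get)   # smallest demand first
    remaining = float(capacity)
    while pending:
        share = remaining / len(pending)         # equal split of what is left
        node = pending[0]
        if demands[node] <= share:
            # This demand fits within an equal share: satisfy it fully and
            # return the unused part to the pool for the other nodes.
            allocation[node] = float(demands[node])
            remaining -= demands[node]
            pending.pop(0)
        else:
            # Every remaining demand exceeds the equal share: split evenly.
            for n in pending:
                allocation[n] = share
            break
    return allocation

# Three servers report queue lengths of 2, 8 and 10 units; 12 units fit in a cycle.
print(max_min_fair_share({"s1": 2, "s2": 8, "s3": 10}, capacity=12))
# → {'s1': 2.0, 's2': 5.0, 's3': 5.0}
```

Here `s1` is fully served, and the 10 units left over are divided equally between `s2` and `s3`, neither of which can be fully served.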
The invention also provides a data communication method based on the software-defined passive optical interconnection network structure, comprising:
Registration: when a terminal node starts up or joins the network, it actively sends a "registration message" to the software-defined network controller. The message contains the terminal node's IP address, MAC address, and current buffer queue length. On receiving the registration message, the controller returns a "registration acknowledgment" to the terminal node to indicate that registration succeeded; if the terminal node does not receive the acknowledgment, it keeps trying to register with the controller. The controller maintains a registration information table (RIT, Routing Information Table) that records the nodes currently active in the network, and it allocates network resources to the terminal nodes listed in this table. Because the controller allocates network resources only to nodes in the registration information table, a terminal node whose address information changes, or which enters a sleep or shutdown state, must actively send a "registration update message" to the controller; the controller updates the registration information table and then acknowledges the update.
Bandwidth allocation: in each round, every terminal node sends the software-defined network controller a request message containing the node's buffer queue length. After collecting the request messages of the terminal nodes, the controller uses the max-min fair share bandwidth allocation algorithm to compute the amount of resources allocated to each terminal node. When allocation is complete, the controller sends a "grant message" to the terminal nodes over the control channel; the grant message carries the result of this round's bandwidth allocation;
Data transmission: after receiving the grant message, each terminal node sends data as authorized by the software-defined network controller, transmitting only in its own time period so that collisions are avoided. When its time slot ends, each terminal node sends a "report message" to the controller to indicate that its transmission is complete and to report its current buffer queue length. The controller updates the stored buffer queue length and uses it as the basis for network resource allocation in the next cycle.
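The controller-side bookkeeping behind these three steps can be sketched as follows. The class name `Controller`, the message tuples, the byte-based `capacity`, and the pluggable `allocate` callback are all assumptions made for illustration; the text does not specify message formats. The `capped_equal_share` function is only a stand-in allocator so that the example runs on its own; it is not the max-min fair share algorithm.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Allocator = Callable[[Dict[str, int], int], Dict[str, int]]

@dataclass
class Controller:
    capacity: int                                        # bytes per cycle (assumed unit)
    rit: Dict[str, dict] = field(default_factory=dict)   # registration table, keyed by MAC

    def on_register(self, mac: str, ip: str, queue_len: int):
        """Registration message: add the node to the table and acknowledge."""
        self.rit[mac] = {"ip": ip, "queue": queue_len}
        return ("REGISTER_ACK", mac)

    def on_register_update(self, mac: str, active: bool, ip: Optional[str] = None):
        """Registration update: drop sleeping/shut-down nodes or change the address."""
        if not active:
            self.rit.pop(mac, None)
        elif mac in self.rit and ip is not None:
            self.rit[mac]["ip"] = ip
        return ("REGISTER_UPDATE_ACK", mac)

    def on_report(self, mac: str, queue_len: int) -> None:
        """Report message: remember the queue length for the next cycle."""
        if mac in self.rit:
            self.rit[mac]["queue"] = queue_len

    def grant(self, allocate: Allocator):
        """Build the grant message from the queue lengths of registered nodes only."""
        demands = {mac: entry["queue"] for mac, entry in self.rit.items()}
        return ("GRANT", allocate(demands, self.capacity))

def capped_equal_share(demands: Dict[str, int], capacity: int) -> Dict[str, int]:
    # Stand-in allocator for the example only (NOT max-min fair share).
    share = capacity // max(len(demands), 1)
    return {mac: min(demand, share) for mac, demand in demands.items()}

ctrl = Controller(capacity=100)
ctrl.on_register("aa:aa", "10.0.0.1", queue_len=30)
ctrl.on_register("bb:bb", "10.0.0.2", queue_len=90)
print(ctrl.grant(capped_equal_share))
```

Keeping the allocator as a parameter means the registration and reporting logic stays the same whichever allocation algorithm the controller runs.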
In practice, the software-defined passive optical interconnection network structure and data communication method of the invention can be deployed in the intra-rack network of a data center. The SD-POIN structure and the SD-MAC mechanism are described below using a data center network as an example. Figure 2 shows the SD-POIN structure applied within a data center rack: the servers in the rack are interconnected through the optical coupler fabric, and the software-defined controller schedules data communication within the rack. As shown in Figure 3, the process from a server joining the network to its data communication includes:
Step 301: when a server starts up or joins the network, it actively sends a "registration message" to the software-defined controller. The message contains the IP addresses of the server and its virtual machines, the current buffer queue length, and so on. On receiving the registration message, the controller sends a "registration acknowledgment" to the server to indicate that registration succeeded. If the server does not receive the acknowledgment, it keeps trying to register. The controller maintains a registration information table (RIT); when a server registers successfully, the controller adds the corresponding entry to the table. The controller allocates network resources to the entries in the registration information table.
Step 302: the software-defined controller collects the request message of each server, which contains that server's buffer queue length, and uses the max-min fair share bandwidth allocation algorithm to allocate bandwidth resources to each server. It then sends a grant message recording the result of the bandwidth allocation to the servers in the rack. The servers use the allocated bandwidth resources to send data within the rack.
Step 303: after receiving the grant message from the controller, each server sends data during the time period specified by the controller. At any moment at most one server is sending data into the rack, so collisions are avoided. When a server finishes sending, it sends a report message to the controller to indicate that it is no longer using the network resources in the rack; the server also reports its current buffer queue length, which the controller stores and uses as the basis for allocating bandwidth to the servers in the next cycle.
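To make the "at most one sender at a time" property concrete, the sketch below turns per-server grants into back-to-back transmission windows. The line rate, the guard interval between windows, and the names `build_schedule` and `has_overlap` are assumptions chosen for the example; the text does not specify how the controller encodes time periods.

```python
def build_schedule(grants_bytes: dict, line_rate_bps: float,
                   guard_s: float, start_s: float = 0.0) -> list:
    """Map granted byte counts to (server, start, end) windows in grant order."""
    schedule = []
    t = start_s
    for server, size in grants_bytes.items():
        if size <= 0:
            continue                          # nothing granted, no window
        duration = size * 8 / line_rate_bps   # time needed to send the grant
        schedule.append((server, t, t + duration))
        t += duration + guard_s               # guard gap before the next sender
    return schedule

def has_overlap(schedule: list) -> bool:
    """True if any two windows overlap, i.e. two servers would transmit at once."""
    ordered = sorted(schedule, key=lambda window: window[1])
    return any(prev[2] > nxt[1] for prev, nxt in zip(ordered, ordered[1:]))

# Assumed figures: 10 Gb/s line rate and a 1 microsecond guard interval.
windows = build_schedule({"s1": 1250, "s2": 2500, "s3": 0},
                         line_rate_bps=10e9, guard_s=1e-6)
for server, start, end in windows:
    print(f"{server}: {start * 1e6:.2f} us to {end * 1e6:.2f} us")
print("collision-free:", not has_overlap(windows))
```

Because each window starts only after the previous one has ended plus the guard gap, the schedule is collision-free by construction; `has_overlap` is a check that a receiver of a grant message could apply independently.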
The operation of the software-defined medium access control mechanism of the invention during one cycle is described in detail below with reference to Figure 8.
In the following embodiment, the servers are connected to the software-defined controller through Ethernet, and control information is transmitted to the software-defined controller over the Ethernet links.
Step 801: At the start of a cycle, the software-defined controller broadcasts a grant message to the rack over Ethernet, so every server in the rack receives it. On receipt, each server uses its own address to look up the time slice assigned to it.
Step 802: Server 1 is the first to use the in-rack network resources. It turns on its laser and sends a message to server N in the same rack.
Step 803: The optical signal reaches the coupler-based optical interconnect. The optical coupler splits the signal and broadcasts it to all servers in the rack, so both server 2 and server N receive it. After converting the optical signal to an electrical signal, each server checks the destination address in the packet header. Server 2 finds that the packet is not addressed to it and discards the packet; server N finds that the packet is addressed to it and accepts the packet.
Step 804: When server 1's time slice ends, server 1 stops transmitting into the rack. It sends a report message to the software-defined controller stating that it has finished transmitting and giving its current buffer queue length, which serves as the basis for bandwidth allocation in the next cycle.
Step 805: Server 2 turns on its laser and begins sending data to other servers in the rack.
Step 806: The above steps repeat until every server listed in the software-defined controller's registration table has transmitted into the in-rack network once. At that point the controller has updated the buffer queue lengths of all servers, and it can use the max-min fair share bandwidth allocation algorithm to compute the bandwidth to allocate to each server in the next cycle.
This example completes the data transmission of the in-rack network within one scheduling cycle.
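Steps 801 to 806 can be sketched as a small simulation of one cycle. This is an illustrative model under stated assumptions, not the patent's implementation: the `Server` class and `run_cycle` function are hypothetical names, a time slice is modelled as a packet budget, servers transmit in registration order, and the sender is assumed not to process its own broadcast.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    address: int
    # Pending packets as (destination address, payload) pairs.
    queue: list = field(default_factory=list)
    # Accepted packets as (source address, payload) pairs.
    received: list = field(default_factory=list)


def run_cycle(servers, grants):
    """Simulate one scheduling cycle of the in-rack network.

    servers: list of Server objects in registration order (assumption).
    grants:  dict mapping an address to the number of packets that server
             may send this cycle (stand-in for the granted time slice).
    Returns a dict mapping each address to the queue length it reports.
    """
    reports = {}
    for sender in servers:  # only one server transmits at a time
        budget = grants.get(sender.address, 0)
        for _ in range(min(budget, len(sender.queue))):
            destination, payload = sender.queue.pop(0)
            # The passive coupler broadcasts the packet to the whole rack.
            for receiver in servers:
                if receiver is sender:
                    continue  # assumption: sender ignores its own signal
                # Each receiver checks the header and keeps only its own
                # packets; everything else is silently discarded.
                if receiver.address == destination:
                    receiver.received.append((sender.address, payload))
        # Report message: slice finished, plus the current queue length.
        reports[sender.address] = len(sender.queue)
    return reports


s1 = Server(1, queue=[(3, "a"), (2, "b"), (3, "c")])
s2 = Server(2, queue=[(1, "x")])
s3 = Server(3)
print(run_cycle([s1, s2, s3], {1: 2, 2: 1, 3: 1}))
# {1: 1, 2: 0, 3: 0}
```

The returned queue lengths correspond to the information the controller has at the end of Step 806 and would feed into the allocation for the next cycle.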
The "software-defined passive optical interconnection network structure and data communication method" of the present invention is applied to the intra-rack network of a data center. A traditional data center network interconnects the servers in a rack with an electrical Ethernet switch (also called a top-of-rack switch, ToR). Here the servers in a rack are instead interconnected by the software-defined passive optical interconnection network structure, and their communication is scheduled by the SD-MAC mechanism. Compared with the traditional electrical top-of-rack switch interconnection scheme, the present invention has the following advantages:
1) The structure of the present invention can reduce the cost of the data center network. A numerical cost analysis shows that this network structure costs less than an existing data center network built on electrical packet top-of-rack switches (ToR switches). Figure 4 shows that the cost of both the SD-POIN scheme and the top-of-rack switch scheme rises as the number of servers in the rack increases, but the cost of SD-POIN rises more slowly; for any given number of servers, the top-of-rack switch scheme costs more than the SD-POIN scheme.
2) The structure of the present invention can reduce the energy consumption of the data center network. A numerical energy analysis shows that this network structure consumes less energy than an existing data center network built on electrical packet top-of-rack switches. Figure 5 shows that the energy consumption of the top-of-rack switch scheme rises quickly as the number of servers in the rack increases, whereas the energy consumption of SD-POIN rises slowly. The main reason is that the optical coupler used in the SD-POIN scheme is a passive optical device that operates without a power supply.
3) The structure of the present invention can reduce the end-to-end packet delay between servers within a rack of the data center network. Figure 6 compares the end-to-end packet delay between servers within a rack for the SD-POIN scheme and the top-of-rack switch scheme. The results show that the end-to-end delay of both schemes increases with network load, but it grows slowly under SD-POIN and quickly under the top-of-rack switch scheme. The end-to-end packet delay under SD-POIN is also lower than through a top-of-rack switch, so the scheme of the present invention has a clear advantage in delay performance.
4) The structure of the present invention uses a software-defined medium access control mechanism. This mechanism applies the max-min fair share bandwidth allocation algorithm so that network resources are distributed fairly among all servers in the rack; compared with a traditional fixed time-division multiplexing scheme, the algorithm also improves the throughput of the in-rack network. Figure 7 compares the in-rack network throughput under the two bandwidth allocation algorithms. Throughput under both algorithms grows as the load increases, and once the network load reaches 0.6, throughput under fixed time-division multiplexing is lower than throughput under the max-min fair share algorithm. The software-defined medium access control mechanism proposed by the present invention therefore uses max-min fair sharing to raise the throughput of the in-rack network while keeping bandwidth allocation fair.
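The throughput gap between the two allocation schemes can be illustrated with a toy calculation. The demand values and the capacity of 9 units below are made-up numbers chosen for the example, not data from Figure 7, and the helper names `fixed_tdm` and `max_min` are hypothetical.

```python
def fixed_tdm(demands, capacity):
    """Fixed TDM: every server gets the same slice, whatever its demand."""
    share = capacity / len(demands)
    # A server cannot send more than it has queued, so unused slice is lost.
    return [min(d, share) for d in demands]


def max_min(demands, capacity):
    """Max-min fair share by progressive filling, smallest demand first."""
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for rank, i in enumerate(order):
        # Equal share of what is left among servers not yet served.
        share = remaining / (len(demands) - rank)
        allocation[i] = min(demands[i], share)
        remaining -= allocation[i]
    return allocation


demands = [1, 2, 7]   # illustrative, unevenly loaded servers
capacity = 9          # illustrative capacity per cycle

print(sum(fixed_tdm(demands, capacity)))  # 6.0
print(sum(max_min(demands, capacity)))    # 9.0
```

With uneven demand, fixed slices leave part of the capacity idle, while max-min fair sharing reassigns the idle part to the busiest server. When every server asks for less than its fixed slice, the two schemes carry the same traffic, which is consistent with the curves separating only at higher load.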
The above describes only preferred embodiments of the present invention and is not intended to limit it. Those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.
| Application Number | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN201610546377.6A | CN105959163A (en) | 2016-07-12 | 2016-07-12 | Passive optical interconnection network structure based on software definition and data communication method |
| Publication Number | Publication Date |
|---|---|
| CN105959163A (en) | 2016-09-21 |
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2016-09-21 |