Movatterモバイル変換


[0]ホーム

URL:


CN110986979A - A Reinforcement Learning-Based Multipath Routing Planning Method for SDN - Google Patents

A Reinforcement Learning-Based Multipath Routing Planning Method for SDN
Download PDF

Info

Publication number
CN110986979A
CN110986979ACN201911183909.4ACN201911183909ACN110986979ACN 110986979 ACN110986979 ACN 110986979ACN 201911183909 ACN201911183909 ACN 201911183909ACN 110986979 ACN110986979 ACN 110986979A
Authority
CN
China
Prior art keywords
flow
reinforcement learning
routing
sdn
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911183909.4A
Other languages
Chinese (zh)
Other versions
CN110986979B (en
Inventor
李传煌
方春涛
卢正勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang UniversityfiledCriticalZhejiang Gongshang University
Priority to CN201911183909.4ApriorityCriticalpatent/CN110986979B/en
Publication of CN110986979ApublicationCriticalpatent/CN110986979A/en
Application grantedgrantedCritical
Publication of CN110986979BpublicationCriticalpatent/CN110986979B/en
Activelegal-statusCriticalCurrent
Anticipated expirationlegal-statusCritical

Links

Images

Classifications

Landscapes

Abstract

Translated fromChinese

本发明公开一种基于强化学习的SDN多路径路由规划方法,该方法为:将强化学习应用于SDN多路径路由规划中,使用QLearning算法作为强化学习模型,根据流量的不同QoS等级,产生不同的奖励值;根据输入的网络拓扑矩阵、当前待转发流特征矩阵,为不同的QoS等级的流设置不同的奖励函数,规划出多条路径转发该流;并在链路带宽不够用的情况下,将一条较大的流划分为多条小流量,从而提高链路带宽利用率。本发明利用强化学习与环境不断交互、调整策略的特点,相比于传统的单路径路由规划,可以实现高链路利用率,能有效减少网络拥塞。

Figure 201911183909

The invention discloses an SDN multi-path routing planning method based on reinforcement learning. The method is as follows: applying reinforcement learning to SDN multi-path routing planning, using QLearning algorithm as a reinforcement learning model, and generating different QoS levels according to different traffic levels. Reward value; according to the input network topology matrix and the current feature matrix of the flow to be forwarded, set different reward functions for flows of different QoS levels, and plan multiple paths to forward the flow; and when the link bandwidth is not enough, Divide a larger flow into multiple smaller flows to improve link bandwidth utilization. Compared with the traditional single-path routing planning, the present invention can realize high link utilization rate and effectively reduce network congestion by utilizing the characteristics of intensive learning and environment constantly interacting and adjusting strategies.

Figure 201911183909

Description

SDN multi-path routing planning method based on reinforcement learning
Technical Field
The invention relates to the field of network communication technology and reinforcement learning, in particular to an SDN multi-path route planning method based on reinforcement learning.
Background
In recent years, with the popularization of the internet, particularly with the appearance of related technologies such as cloud computing and big data, the internet has entered a rapid development period. The rapid development of the internet enables the data volume of network transmission services to increase rapidly, and particularly, in recent years, with the rise of short video and live broadcast platforms, the interaction of network services is more real-time, and a terminal user puts higher demands on the Quality of Service (QoS) of the network services. However, in the case of limited network resources, the continuous increase of internet traffic data may cause problems such as a sharp increase of bandwidth consumption, difficulty in guaranteeing quality of service, and an increase of security problems. Obviously, the traditional network architecture is difficult to meet the diversified requirements of users. In view of the foregoing, there is a need in the internet industry for a new network architecture that addresses existing network problems, and that is more flexible and efficient than conventional architectures to meet the ever-increasing traffic data needs of society.
The SDN is a novel network architecture, is widely concerned by various borders, and solves some problems which cannot be avoided in the traditional network. In the traditional network architecture, each device can independently make a forwarding rule and transmit information through a series of network protocols (such as TCP/IP), under the system structure, the control and the forwarding of the network device are closely coupled, the network device can only plan a path by taking the network device as a center as a flow service and does not have network global resource information, and the problems of network link congestion and the like are easily caused. SDN forwards and control separation, can obtain link information in real time through the OpenFlow protocol, be favorable to the centralized control of network for the control layer obtains network global resource information, and carry out unified management and distribution according to the demand of business, and simultaneously, centralized control still makes whole network regard as a whole, convenient maintenance. Compared with the traditional IP network, the SDN network solves the problems of inaccurate routing information, low routing efficiency and the like of the traditional network, and lays a foundation for realizing intelligent routing planning according to the requirements of different flows. Therefore, the research on the SDN network architecture is of great significance. Routing is an indispensable component of both traditional networks and SDN networks, however, the basic adopted by the current mainstream SDN routing modules is Dijkstra (shortest path) algorithm, if all data packets depend on the shortest path algorithm only, data flows are easy to cause link congestion due to the selection of the same link, and other links are placed in an idle state, which greatly reduces link utilization. On the other hand, the shortest path algorithm is an algorithm for finding the shortest path in graph theory, and when the algorithm is operated, the shortest path from the source node to all other nodes in the topology is actually obtained, so the time complexity of the algorithm is high. There are also protocols that support multipath, such as ECMP, but these protocols do not take into account the quality of service requirements of different traffic streams. Therefore, a better routing strategy is needed in the SDN network to generate routes, improve the performance of the network, and guarantee the service quality of different service flows.
Disclosure of Invention
The invention provides a high-bandwidth-utilization-oriented SDN intelligent routing planning technology, which mainly adopts the shortest path as a routing planning algorithm around the current SDN network, so that the problem of low link bandwidth utilization rate and the like is caused.
The technical scheme adopted by the invention for solving the technical problem is as follows: an SDN multi-path route planning method based on reinforcement learning comprises the following steps:
step 1: acquiring available bandwidth information, total bandwidth information, node information and link information of a network to construct a network topology matrix, and acquiring a characteristic matrix of a stream to be forwarded;
step 2: a QLearning algorithm is adopted as a reinforcement learning model, and the network topology matrix and the characteristic matrix of the flow to be forwarded in the step 1 are input into a reinforcement learning model training Q value table; the reward function R in the QLearning algorithm is as follows:
Figure RE-GDA0002379835710000021
wherein: rt(Si,Aj) Indicating slave status S of a data packetiSelection action AjThe obtained reward is represented in a routing planning task as the reward generated when the next hop selected by the data packet at the node i is the node j, β is the flow QoS grade, η is the bandwidth utilization rate, d is the destination node, delta (j-d) is an impulse function and represents that when the next hop of the data packet is the destination node, the value is 1, T is the connection state of the network topology nodes, 1 when the two nodes are connected and 0 when the two nodes are not connected, and g (x) is) As a cost function, the following is shown:
Figure RE-GDA0002379835710000022
in the formula ImIs the total number of links of the network topology. x is the hop count passed by the data packet in forwarding;
and step 3: and obtaining a path Routing according to the Q value table, putting the path Routing into a path set Routing (S, D), and judging whether the minimum link bandwidth of the Routing is smaller than the bandwidth of the stream. If so, a slice of size is divided from the stream
Figure RE-GDA0002379835710000023
Wherein B isCan be usedRepresents the minimum link available bandwidth for the current output path, β represents the Qos level, Σ, of the current flowiβiIndicating the overall QoS class. The divided streams are passed from the source node to the target node through the current output path. Taking the residual flow as a new flow to flow back to the step 2 to train the Q value table again; if not, the planning is finished, and the planned multi-path route is obtained from the Routing (S, D).
Further, the traffic characteristic matrix to be forwarded includes a source address, a destination address, a QoS class, and a traffic size of the flow.
Further, the process of training the Q-value table by the QLearning algorithm is specifically as follows:
and setting the maximum step number of the single training.
(1) Initializing a Q value table and a reward function R;
(2) an action epsilon-greedy strategy P is adopted, and an action a is selected;
(3) executing action a, transferring to a state s', calculating a reward value by using a reward function R, and updating a Q value table;
(4) and judging whether s' is a destination node or not. If not, let s ═ s', return to (2).
Further, the SDN multipath routing planning method based on reinforcement learning is characterized in that the cost function is defined as that the cost increases with the increase of the number x of hops passed by the data packet forwarding, and the cost function g (x) is e (0,1), and the cost function should satisfy: the curve of the cost function g (x) is an upward convex function curve, and when the total hop number of the data packet tends to infinity, the cost function value tends to 1.
The method has the advantages that reinforcement learning is applied to SDN multi-path routing planning, a QLearning algorithm is used as a reinforcement learning model, different reward functions are set for flows with different QoS levels according to an input network topology matrix and a current flow characteristic matrix to be forwarded, a plurality of paths are planned to forward the flows, and a larger flow is divided into a plurality of small flows under the condition that the link bandwidth is not enough, so that the utilization rate of the link bandwidth is improved.
Drawings
Figure 1 is an SDN multi-path routing planning architecture diagram;
figure 2 is a SDN network topology diagram;
FIG. 3 is a graph of a cost function;
fig. 4 is a flow chart of reinforcement learning-based multi-path routing planning.
Detailed Description
Aiming at the fact that the existing SDN control adopts Dijkstra algorithm as the shortest route searching algorithm, the method tries to apply reinforcement learning to the SDN route. And directly using the network topology environment for the training of the Q value table by utilizing the characteristic of SDN forwarding control separation. Considering that different services have different requirements on QoS, the invention provides routes with different service qualities for different services; and under the condition that the link bandwidth is not enough, a larger flow is divided into a plurality of small flows, so that the link bandwidth utilization rate is improved.
As shown in fig. 1, the present invention provides a reinforcement learning based SDN multi-path route planning method, which includes the following steps:
step 1: acquiring available bandwidth information, total bandwidth information, node information and link information of a network to construct a network topology matrix, and constructing a network topology graph as shown in figure 2 by using Mininet, wherein the network topology graph comprises 9 OpenFlow switches and 5 hosts; the method comprises the steps of obtaining a characteristic matrix of the flow to be forwarded, setting the bandwidth of each network link to be 200 according to a multi-path routing planning algorithm and an SDN network topology, setting a sending end to be h 1-h 5 and a receiving end to be h 1-h 5, wherein the sending end randomly sends data to other receiving ends with the probability of 20%, and all hosts send 30 static flows in total, wherein the static flows refer to flows which occupy the bandwidth of the link until the experiment is finished once being injected into the network.
Step 2: as shown in fig. 1, using QLearning algorithm as the reinforcement learning model, the multi-path routing algorithm proposed by the present invention uses markov decision process for modeling, and therefore, the model MDP quadruplet proposed by the present invention is defined as follows:
(1) state collection: in a network topology, each switch represents a state, and thus, according to the network topology, a set of network states is defined herein as follows:
S=[s1,s2,s3,…s9]
wherein s is1~s9Representing 9 OpenFlow switches in the network. The source node information of the data packet indicates an initial state of the data packet, and the destination node information indicates a termination state of the data packet. When a certain data packet reaches the destination node, the data packet reaches the termination state. Once the current data packet reaches the termination state, the termination of one round of training is indicated, and the data packet will return to the initial state again for the next round of training.
(2) An action space: in an SDN network, the transmission path of a data packet is determined by the network state, i.e. the data packet can only be transmitted at connected network nodes. According to the network topology, the network connection state is defined as the following formula:
Figure RE-GDA0002379835710000041
since packets can only be transmitted at connected network nodes, the following set of actions for each state S [ i ] ∈ S can be defined herein according to the set of network states and the network connection state:
A(si)={sj|T[si][sj]=1}
indicates that the current state is at siThe state-selectable action set appears as s on the network topologyiDirectly connected nodes sjI.e. the current state siWill only select the state s connected to itj. For example: state s1The action set of (1) is: a(s)1)={s2,s4}。
(3) And (3) state transition: in each round of training, when the data packet is in state siIf the action is not the selected state of the round, the data packet moves to the next state. Another key issue with reinforcement learning is the generation of reward values. When the Agent generates state transition, the system feeds back a reward to the Agent according to the reward function R.
(4) The final purpose of the multi-path routing planning based on reinforcement learning is to plan reasonable multi-paths through training, so the setting of a reward value R is also important, the bandwidth utilization rate and the delay are mainly considered herein, the delay mainly refers to the hop count of the path, and in order to plan different paths for links with different QoS levels, the hop count of the planned path is considered to be smaller as the traffic level β is larger.
1. QoS level β and link utilization η need to be considered;
2.β large flows are encouraged to allocate paths with fewer hops;
in summary, the reward function formula designed herein is as follows:
Figure RE-GDA0002379835710000051
wherein: rt(Si,Aj) Indicating slave status S of a data packetiSelection action AjThe obtained reward is represented in a routing planning task as the reward generated when the next hop selected by the data packet at the node i is the node j, β is the flow QoS grade, η is the bandwidth utilization rate, d is the destination node, delta (j-d) is an impulse function and represents that when the next hop of the data packet is the destination node, the value is 1, T is the connection state of the network topology nodes, the two nodes are 1 when connected and 0 when not connected, and the reward function represents that the data packet is in the state SiWhen the next hop that can be selected (connected) is j (action a)j) Then, from T [ S ]i][Aj]The bonus function when 1 yields a bonus value, otherwise the bonus value is set to-1.
g (x) is a cost function defined as the cost increases as the number of hops x passed by the packet increases, and the cost function g (x) is e (0,1) with lmFor the total number of links in the network topology, considering that when the network topology is large, it is impractical to walk all paths, and a data packet can only be forwarded through a part of links, so the cost function should satisfy: the early stage is increased quickly and becomes stable to the later stage, if the total number of hops passed by the data packet reaches lmThen the cost function value is maximized. In summary, the cost function is as follows:
Figure RE-GDA0002379835710000061
as shown in fig. 3, it can be seen that the cost function is an increasing function, the range of the increasing function is (0,1), the cost increases with the increase of the number of hops x, and the function grows more rapidly in the early stage and tends to be stable in the later stage, which meets the requirement of the cost function.
The second requirement of designing the reward function is to encourage β flows with large flows to allocate paths with small hop count, so the cost function g (x) in the reward function is multiplied by the traffic QoS class β, at this time, under the same condition, the more the number of the path hops passed by the packet forwarding, the larger the cost required by the flows with large QoS class, and the paths with small hop count will be selected by the flows with large QoS class during planning.
After the MDP quadruples are determined, a Q-value table is trained by using a Qlearning algorithm, which comprises the following specific steps.
And setting the maximum step number of the single training.
(1) Initializing a Q value table and a reward function R;
(2) an action epsilon-greedy strategy P is adopted, and an action a is selected;
(3) executing action a, transferring to a state s', calculating a reward value by using a reward function R, and updating a Q value table;
(4) and judging whether s' is a destination node or not. If not, the process returns to step (2) with s ═ s'.
As shown in fig. 4, the learning rate α is set to 0.8, the discount rate γ is set to 0.6, and the value of the action strategy ∈ -greedy strategy ∈ is equal to 0.1.
And obtaining a path Routing according to the trained Q value table, putting the path Routing into a path set Routing (S, D), and judging whether the minimum link bandwidth of the Routing is smaller than the bandwidth of the stream. If so, a slice of size is divided from the stream
Figure RE-GDA0002379835710000062
Of (a) wherein BCan be usedRepresents the minimum link available bandwidth for the current output path, β represents the Qos level, Σ, of the current flowiβiIndicating the overall QoS class. And the small flow reaches the target node from the source destination node through the current output path. Use of
Figure RE-GDA0002379835710000063
Updating the flow, wherein B represents the size of the planned flow, namely returning the rest flows as new flows to step 2 for the training of the Q value table; if not, the planning is finished, and the planned multi-path route is obtained from the Routing (S, D).
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.

Claims (4)

Translated fromChinese
1.一种基于强化学习的SDN多路径路由规划方法,其特征在于,该方法包括以下步骤:1. a SDN multi-path routing planning method based on reinforcement learning, is characterized in that, this method comprises the following steps:步骤1:采集网络的可用带宽信息、总带宽信息、节点信息和链路信息构建网络拓扑矩阵,并获取待转发流特征矩阵;Step 1: collect available bandwidth information, total bandwidth information, node information and link information of the network to construct a network topology matrix, and obtain a flow feature matrix to be forwarded;步骤2:采用QLearning算法作为强化学习模型,将步骤1中网络拓扑矩阵和待转发流特征矩阵输入到强化学习模型训练Q值表;所述QLearning算法中奖励函数R如下:Step 2: Using the QLearning algorithm as the reinforcement learning model, input the network topology matrix and the feature matrix of the flow to be forwarded in step 1 into the reinforcement learning model training Q-value table; the reward function R in the QLearning algorithm is as follows:
Figure FDA0002291955560000011
Figure FDA0002291955560000011
其中:Rt(Si,Aj)表示数据包从状态Si选择动作Aj时得到的奖励,在路由规划任务中表现为数据包在节点i时选择的下一跳为节点j时产生的奖励;β为流量QoS等级;η为带宽利用率;d为目的节点;δ(j-d)为冲激函数,表示当数据包下一跳节点为目的节点时,该值为1;T为网络拓扑节点的连接状态,两个节点相连时为1,不相连时为0;g(x)为代价函数,如下所示:Among them: Rt (Si , Aj ) represents the reward obtained when the data packet selects action Aj from state Si, and in the routing planning task, it is expressed as the next hop selected by the data packet at nodei is generated when node j is β is the traffic QoS level; η is the bandwidth utilization rate; d is the destination node; The connection status of topological nodes, 1 when two nodes are connected, and 0 when they are not connected; g(x) is the cost function, as shown below:
Figure FDA0002291955560000012
Figure FDA0002291955560000012
式中,lm为网络拓扑总链路数。x为数据包转发时经过的跳数;In the formula,lm is the total number of links in the network topology. x is the number of hops passed by the packet forwarding;步骤3:根据Q值表得到一条路径Routing,放入路径集合Routing(S,D)中,并判断Routing的最小链路带宽是否小于流的带宽。若是,则从流中划分出一条大小为
Figure FDA0002291955560000013
的流,其中B可用表示当前输出路径的最小链路可用带宽,β表示当前流的Qos等级,∑iβi表示总QoS等级。将被划分出的流从源节点通过当前输出路径达到目标节点。把剩余流作为新的流回到步骤2重新进行Q值表的训练;若否,规划结束,从Routing(S,D)中得到规划后的多路径路由。
Step 3: Obtain a route Routing according to the Q value table, put it into the route set Routing(S, D), and judge whether the minimum link bandwidth of Routing is smaller than the bandwidth of the flow. If so, divide a stream from the stream with a size of
Figure FDA0002291955560000013
where Bavailable represents the minimum link available bandwidth of the current output path, β represents the QoS level of the current flow, and ∑i βi represents the total QoS level. The divided stream is routed from the source node to the destination node through the current output path. Return the remaining flow as a new flow to step 2 to re-train the Q-value table; if not, the planning is over, and the planned multi-path routing is obtained from Routing(S, D).
2.根据权利要求1所述的一种基于强化学习的SDN多路径路由规划方法,其特征在于,所述待转发流量特征矩阵把一条流的信息以矩阵形式表现出来,其中包含流的源地址、目的地址、QoS等级、流量大小。2. a kind of SDN multi-path routing planning method based on reinforcement learning according to claim 1, is characterized in that, described flow characteristic matrix to be forwarded expresses the information of a flow in matrix form, wherein contains the source address of flow , destination address, QoS level, traffic size.3.根据权利要求1所述的一种基于强化学习的SDN多路径路由规划方法,其特征在于,所述QLearning算法训练Q值表的过程具体如下:3. a kind of SDN multi-path routing method based on reinforcement learning according to claim 1, is characterized in that, the process of described QLearning algorithm training Q value table is as follows:设置单次训练最大步数后,进行如下步骤;After setting the maximum number of steps for a single training, perform the following steps;(1)初始化Q值表和奖励函数R;(1) Initialize the Q value table and the reward function R;(2)采取动作ε-贪心策略P,选择动作a;(2) Take action ε-greedy strategy P and choose action a;(3)执行动作a,转移到状态s′,用奖励函数R计算奖励值,更新Q值表;(3) Execute action a, transfer to state s', calculate the reward value with reward function R, and update the Q value table;(4)判断s′是否为目的节点。若不是,使s=s′,回到步骤(2)。(4) Determine whether s' is the destination node. If not, set s=s' and go back to step (2).4.根据权利要求3所述的一种基于强化学习的SDN多路径路由规划方法,其特征在于,所述代价函数定义为随着数据包转发时经过的跳数x的增加,代价增加,且代价函数g(x)∈(0,1),代价函数应满足:代价函数g(x)的曲线为上凸函数曲线,当数据包经过的总跳数趋向无穷时,代价函数值趋向1。4. a kind of SDN multi-path routing method based on reinforcement learning according to claim 3, is characterized in that, described cost function is defined as with the increase of the hop number x that passes through when the data packet is forwarded, the cost increases, and The cost function g(x)∈(0,1), the cost function should satisfy: the curve of the cost function g(x) is an upwardly convex function curve, when the total number of hops passed by the data packet tends to infinity, the cost function value tends to 1.
CN201911183909.4A2019-11-272019-11-27SDN multi-path routing planning method based on reinforcement learningActiveCN110986979B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN201911183909.4ACN110986979B (en)2019-11-272019-11-27SDN multi-path routing planning method based on reinforcement learning

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN201911183909.4ACN110986979B (en)2019-11-272019-11-27SDN multi-path routing planning method based on reinforcement learning

Publications (2)

Publication NumberPublication Date
CN110986979Atrue CN110986979A (en)2020-04-10
CN110986979B CN110986979B (en)2021-09-10

Family

ID=70087419

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN201911183909.4AActiveCN110986979B (en)2019-11-272019-11-27SDN multi-path routing planning method based on reinforcement learning

Country Status (1)

CountryLink
CN (1)CN110986979B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN111917657A (en)*2020-07-022020-11-10北京邮电大学Method and device for determining flow transmission strategy
CN112398733A (en)*2020-11-242021-02-23新华三大数据技术有限公司Traffic scheduling forwarding method and device
CN112671648A (en)*2020-12-222021-04-16北京浪潮数据技术有限公司SDN data transmission method, SDN, device and medium
CN112822109A (en)*2020-12-312021-05-18上海缔安科技股份有限公司SDN core network QoS route optimization algorithm based on reinforcement learning
CN113098771A (en)*2021-03-262021-07-09哈尔滨工业大学Distributed self-adaptive QoS routing method based on Q learning
CN113158543A (en)*2021-02-022021-07-23浙江工商大学Intelligent prediction method for software defined network performance
CN113301098A (en)*2021-01-052021-08-24阿里巴巴集团控股有限公司Path planning method, CDN connection establishing method, device and storage medium
CN113347108A (en)*2021-05-202021-09-03中国电子科技集团公司第七研究所SDN load balancing method and system based on Q-learning
CN113347104A (en)*2021-05-312021-09-03国网山东省电力公司青岛供电公司SDN-based routing method and system for power distribution Internet of things
CN113489654A (en)*2021-07-062021-10-08国网信息通信产业集团有限公司Routing method, routing device, electronic equipment and storage medium
CN114124828A (en)*2022-01-272022-03-01广东省新一代通信与网络创新研究院Machine learning method and device based on programmable switch
CN114845359A (en)*2022-03-142022-08-02中国人民解放军军事科学院战争研究院Multi-intelligent heterogeneous network selection method based on Nash Q-Learning
WO2022257917A1 (en)*2021-06-112022-12-15华为技术有限公司Path planning method and related device
CN115550236A (en)*2022-08-312022-12-30国网江西省电力有限公司信息通信分公司Data protection method for routing optimization of security middlebox resource pool
US11606265B2 (en)2021-01-292023-03-14World Wide Technology Holding Co., LLCNetwork control in artificial intelligence-defined networking
CN115941579A (en)*2022-11-102023-04-07北京工业大学 A Hybrid Routing Method Based on Deep Reinforcement Learning
CN116582479A (en)*2023-06-092023-08-11南京邮电大学 A service flow routing method, system, storage medium and device
CN117033005A (en)*2023-10-072023-11-10之江实验室Deadlock-free routing method and device, storage medium and electronic equipment
CN118233671A (en)*2024-03-062024-06-21中国人民解放军国防科技大学Multipath video transmission method based on multi-agent deep reinforcement learning
US12175364B2 (en)2021-01-292024-12-24World Wide Technology Holding Co., LLCReinforcement-learning modeling interfaces
US12373702B2 (en)2021-01-292025-07-29World Wide Technology Holding Co., LLCTraining a digital twin in artificial intelligence-defined networking

Citations (11)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN104640168A (en)*2014-12-042015-05-20北京理工大学Q-learning based vehicular ad hoc network routing method
CN106713143A (en)*2016-12-062017-05-24天津理工大学Adaptive reliable routing method for VANETs
CN107104819A (en)*2017-03-232017-08-29武汉邮电科学研究院Adaptive self-coordinating unified communications and communication means based on SDN
US20180034922A1 (en)*2016-07-282018-02-01At&T Intellectual Property I, L.P.Network configuration for software defined network via machine learning
CN107911299A (en)*2017-10-242018-04-13浙江工商大学A kind of route planning method based on depth Q study
CN109005471A (en)*2018-08-072018-12-14安徽大学Based on the extensible video stream method of multicasting of QoS Intellisense under SDN environment
CN109361601A (en)*2018-10-312019-02-19浙江工商大学 A SDN Routing Planning Method Based on Reinforcement Learning
CN109450794A (en)*2018-12-112019-03-08上海云轴信息科技有限公司A kind of communication means and equipment based on SDN network
CN109547340A (en)*2018-12-282019-03-29西安电子科技大学SDN data center network jamming control method based on heavy-route
US20190123974A1 (en)*2016-06-232019-04-25Huawei Technologies Co., Ltd.Method for generating routing control action in software-defined network and related device
CN109768940A (en)*2018-12-122019-05-17北京邮电大学 Traffic distribution method and device for multi-service SDN network

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN104640168A (en)*2014-12-042015-05-20北京理工大学Q-learning based vehicular ad hoc network routing method
US20190123974A1 (en)*2016-06-232019-04-25Huawei Technologies Co., Ltd.Method for generating routing control action in software-defined network and related device
US20180034922A1 (en)*2016-07-282018-02-01At&T Intellectual Property I, L.P.Network configuration for software defined network via machine learning
CN106713143A (en)*2016-12-062017-05-24天津理工大学Adaptive reliable routing method for VANETs
CN107104819A (en)*2017-03-232017-08-29武汉邮电科学研究院Adaptive self-coordinating unified communications and communication means based on SDN
CN107911299A (en)*2017-10-242018-04-13浙江工商大学A kind of route planning method based on depth Q study
CN109005471A (en)*2018-08-072018-12-14安徽大学Based on the extensible video stream method of multicasting of QoS Intellisense under SDN environment
CN109361601A (en)*2018-10-312019-02-19浙江工商大学 A SDN Routing Planning Method Based on Reinforcement Learning
CN109450794A (en)*2018-12-112019-03-08上海云轴信息科技有限公司A kind of communication means and equipment based on SDN network
CN109768940A (en)*2018-12-122019-05-17北京邮电大学 Traffic distribution method and device for multi-service SDN network
CN109547340A (en)*2018-12-282019-03-29西安电子科技大学SDN data center network jamming control method based on heavy-route

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TRUONG THU HUONG 等: "A global multipath load-balanced routing algorithm Reinforcement Learning", 《2019 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC)》*
金子晋 等: "SDN环境下基于QLearning算法的业务划分路由选路机制", 《网络与信息安全学报》*

Cited By (29)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN111917657A (en)*2020-07-022020-11-10北京邮电大学Method and device for determining flow transmission strategy
CN112398733B (en)*2020-11-242022-03-25新华三大数据技术有限公司Traffic scheduling forwarding method and device
CN112398733A (en)*2020-11-242021-02-23新华三大数据技术有限公司Traffic scheduling forwarding method and device
CN112671648A (en)*2020-12-222021-04-16北京浪潮数据技术有限公司SDN data transmission method, SDN, device and medium
CN112822109A (en)*2020-12-312021-05-18上海缔安科技股份有限公司SDN core network QoS route optimization algorithm based on reinforcement learning
CN113301098A (en)*2021-01-052021-08-24阿里巴巴集团控股有限公司Path planning method, CDN connection establishing method, device and storage medium
US11606265B2 (en)2021-01-292023-03-14World Wide Technology Holding Co., LLCNetwork control in artificial intelligence-defined networking
US12373702B2 (en)2021-01-292025-07-29World Wide Technology Holding Co., LLCTraining a digital twin in artificial intelligence-defined networking
US12175364B2 (en)2021-01-292024-12-24World Wide Technology Holding Co., LLCReinforcement-learning modeling interfaces
CN113158543A (en)*2021-02-022021-07-23浙江工商大学Intelligent prediction method for software defined network performance
CN113158543B (en)*2021-02-022023-10-24浙江工商大学Intelligent prediction method for software defined network performance
CN113098771A (en)*2021-03-262021-07-09哈尔滨工业大学Distributed self-adaptive QoS routing method based on Q learning
CN113347108A (en)*2021-05-202021-09-03中国电子科技集团公司第七研究所SDN load balancing method and system based on Q-learning
CN113347108B (en)*2021-05-202022-08-02中国电子科技集团公司第七研究所 A Q-learning-based SDN load balancing method and system
CN113347104A (en)*2021-05-312021-09-03国网山东省电力公司青岛供电公司SDN-based routing method and system for power distribution Internet of things
WO2022257917A1 (en)*2021-06-112022-12-15华为技术有限公司Path planning method and related device
CN113489654A (en)*2021-07-062021-10-08国网信息通信产业集团有限公司Routing method, routing device, electronic equipment and storage medium
CN113489654B (en)*2021-07-062024-01-05国网信息通信产业集团有限公司Routing method, device, electronic equipment and storage medium
CN114124828A (en)*2022-01-272022-03-01广东省新一代通信与网络创新研究院Machine learning method and device based on programmable switch
CN114845359A (en)*2022-03-142022-08-02中国人民解放军军事科学院战争研究院Multi-intelligent heterogeneous network selection method based on Nash Q-Learning
CN115550236A (en)*2022-08-312022-12-30国网江西省电力有限公司信息通信分公司Data protection method for routing optimization of security middlebox resource pool
CN115550236B (en)*2022-08-312024-04-30国网江西省电力有限公司信息通信分公司Data protection method oriented to security middle station resource pool route optimization
CN115941579A (en)*2022-11-102023-04-07北京工业大学 A Hybrid Routing Method Based on Deep Reinforcement Learning
CN115941579B (en)*2022-11-102024-04-26北京工业大学 A hybrid routing method based on deep reinforcement learning
CN116582479A (en)*2023-06-092023-08-11南京邮电大学 A service flow routing method, system, storage medium and device
CN117033005B (en)*2023-10-072024-01-26之江实验室Deadlock-free routing method and device, storage medium and electronic equipment
CN117033005A (en)*2023-10-072023-11-10之江实验室Deadlock-free routing method and device, storage medium and electronic equipment
CN118233671B (en)*2024-03-062024-08-16中国人民解放军国防科技大学Multipath video transmission method based on multi-agent deep reinforcement learning
CN118233671A (en)*2024-03-062024-06-21中国人民解放军国防科技大学Multipath video transmission method based on multi-agent deep reinforcement learning

Also Published As

Publication numberPublication date
CN110986979B (en)2021-09-10

Similar Documents

PublicationPublication DateTitle
CN110986979B (en)SDN multi-path routing planning method based on reinforcement learning
CN112491714B (en)Intelligent QoS route optimization method and system based on deep reinforcement learning in SDN environment
CN109361601B (en) A SDN Routing Planning Method Based on Reinforcement Learning
Xu et al.Experience-driven networking: A deep reinforcement learning based approach
CN112822109B (en)SDN core network QoS route optimization method based on reinforcement learning
CN107094115B (en) An Ant Colony Optimization Load Balancing Routing Algorithm Based on SDN
CN108881048B (en) A Congestion Control Method for Named Data Networks Based on Reinforcement Learning
CN109951335B (en)Satellite network delay and rate combined guarantee routing method based on time aggregation graph
CN110601973B (en) A route planning method, system, server and storage medium
CN112600759B (en)Multipath traffic scheduling method and system based on deep reinforcement learning under Overlay network
CN110995590A (en) An Efficient Routing Method in Distributed Area Network
CN105897575A (en)Path computing method based on multi-constrained path computing strategy under SDN
CN106789648A (en)Software defined network route decision method based on content storage with network condition
CN114143264A (en) A traffic scheduling method based on reinforcement learning in SRv6 network
CN108040012A (en)Multi-object multicast routed path construction method in the SDN network that must be searched for based on longicorn
CN113518035B (en) Routing determination method and device
CN101986628B (en)Method for realizing multisource multicast traffic balance based on ant colony algorithm
CN117294643A (en)Network QoS guarantee routing method based on SDN architecture
CN115037669A (en) A cross-domain data transfer method based on federated learning
Qadeer et al.Flow-level dynamic bandwidth allocation in SDN-enabled edge cloud using heuristic reinforcement learning
Stepanov et al.On fair traffic allocation and efficient utilization of network resources based on MARL
Wei et al.G-Routing: graph neural networks-based flexible online routing
Mao et al.On a cooperative deep reinforcement learning-based multi-objective routing strategy for diversified 6G metaverse services
CN120075121A (en)Multi-agent deep reinforcement learning-based multi-path routing method for communication network
CN114745322A (en) Video Stream Routing Method Based on Genetic Algorithm in SDN Environment

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
GR01Patent grant
GR01Patent grant

[8]ページ先頭

©2009-2025 Movatter.jp