CN-UPB/NFVdeep

NFVdeep: Deep Reinforcement Learning for Online Orchestration of Service Function Chains

License

MIT license

NFVdeep

Deep Reinforcement Learning for Online Orchestration of Service Function Chains

Disclaimer: This is an unofficial implementation that tries to reproduce the deep reinforcement learning approach described in the NFVdeep paper by Xiao et al. as part of a graduate student project. While the implemented agent did learn over time, we were not able to reproduce the results stated in the paper.

Advisor: Stefan Schneider

Developers: Nils Rudminat, Stefan Werner

Setup

Assuming an Anaconda distribution (version 4.8.4) is already installed on an Ubuntu 18.04 machine, the environment can be created via conda env create -f environment.yml. Depending on your system's setup, you may need to install additional packages for RayTune and the TensorFlow version in use.

Experiments

The script.py file serves as an interface for running either baseline or DRL agents on the NFVdeep environment with their default parameterization, i.e. without hyperparameter optimization. Here, you can specify the overlay topology and the network's resources, as well as properties of the arrival process. For instance, we may train a stable-baselines PPO DRL agent on the abilene network with incoming requests arising from a Poisson process by executing:

python script.py --agent PPO --overlay <data path>/abilene.gpickle --requests <request path>/requests.json --output <output path> --logs <log path>

Hyperparameter Optimization

We employ distributed (single-node) Bayesian optimization with BoTorch and RayTune to facilitate scalable hyperparameter optimization for our reinforcement learning agent. Specifically, we specify a parameter search space from which agent configurations are first sampled and subsequently evaluated. Here, tune.py provides an interface to our implementation's tuned DRL agents. Note, however, that absolute paths must be used, for instance by executing:

python tune.py --agent PPO --overlay <abs data path>/abilene.gpickle --requests <abs request path>/requests.json --output <abs output path> --logs <abs log path>
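The "sample, then evaluate" loop described above can be sketched in plain Python. Note that this sketch draws configurations uniformly at random; it does not perform Bayesian optimization and does not call BoTorch or RayTune. The parameter names, their ranges and the evaluate function are hypothetical placeholders and are not taken from tune.py.

```python
import math
import random

# Hypothetical search space; names and ranges are illustrative only and are
# not taken from tune.py.
SEARCH_SPACE = {
    "learning_rate": ("loguniform", 1e-5, 1e-2),
    "gamma": ("uniform", 0.9, 0.9999),
    "n_steps": ("choice", [64, 128, 256, 512]),
}


def sample_config(space, rng):
    """Draw one agent configuration from the search space."""
    config = {}
    for name, (kind, *args) in space.items():
        if kind == "uniform":
            low, high = args
            config[name] = rng.uniform(low, high)
        elif kind == "loguniform":
            # Sample uniformly in log space, then map back.
            low, high = args
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "choice":
            (options,) = args
            config[name] = rng.choice(options)
        else:
            raise ValueError(f"unknown distribution: {kind}")
    return config


def evaluate(config):
    """Placeholder objective standing in for 'train an agent, return its reward'."""
    return -abs(math.log10(config["learning_rate"]) + 3.5) + config["gamma"]


def search(space, num_samples, seed=0):
    """Sample configurations, evaluate each one, and return the best."""
    rng = random.Random(seed)
    trials = [sample_config(space, rng) for _ in range(num_samples)]
    return max(trials, key=evaluate)


best = search(SEARCH_SPACE, num_samples=20)
```

A Bayesian optimizer differs from this sketch in that it uses the scores of earlier trials to decide which configuration to try next, instead of sampling each one independently.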

Retrieving Placement Decisions

The placement decisions for VNFs of arriving service requests are automatically tabulated in the placements.txt file (shown below) under the --output path, whereas recorded monitoring metrics such as the obtained reward or acceptance rate are logged to results.csv. For each episode, trial and arriving service function request, we tabulate its arrival time, time-to-live, bandwidth demand, maximum end-to-end latency, requested VNFs (CPUs & memory) as well as the list of placement decisions taken (node indices). If the list of placements is empty, the request was not embedded in the substrate network and NFVdeep used its built-in backtracking mechanism to release bound resources.

  Episode    Trial    Arrival    TTL    Bandwidth    Max Latency  VNFs (CPUs & memory)     Placements
---------  -------  ---------  -----  -----------  -------------  -----------------------  ------------
        0        0       4495     73      71.2339          10000  [(7, 7.52), (10, 5.69)]  [0, 2]
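A minimal sketch of how such a table could be read back into Python, assuming the file keeps the column layout of the sample above (columns separated by at least two spaces, list-valued columns written as Python literals). The function names are hypothetical, and the second data row in SAMPLE is made up to illustrate a rejected request.

```python
import ast
import re

# First data row is the sample from the README; the second row is invented
# here to show what a rejected request (empty placement list) would look like.
SAMPLE = """\
  Episode    Trial    Arrival    TTL    Bandwidth    Max Latency  VNFs (CPUs & memory)     Placements
---------  -------  ---------  -----  -----------  -------------  -----------------------  ------------
        0        0       4495     73      71.2339          10000  [(7, 7.52), (10, 5.69)]  [0, 2]
        0        0       4510     51      33.1000          10000  [(3, 2.10)]              []
"""


def parse_placements(text):
    """Parse the tabulated placement log into a list of dictionaries."""
    rows = []
    for line in text.splitlines():
        # Columns are separated by two or more spaces; list literals only
        # contain single spaces, so they stay intact.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != 8 or not fields[0].isdigit():
            continue  # skip header, separator and blank lines
        rows.append({
            "episode": int(fields[0]),
            "trial": int(fields[1]),
            "arrival": float(fields[2]),
            "ttl": float(fields[3]),
            "bandwidth": float(fields[4]),
            "max_latency": float(fields[5]),
            "vnfs": ast.literal_eval(fields[6]),
            "placements": ast.literal_eval(fields[7]),
        })
    return rows


def acceptance_rate(rows):
    """Fraction of requests with a non-empty placement list."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row["placements"]) / len(rows)


rows = parse_placements(SAMPLE)
rate = acceptance_rate(rows)
```

To use it on a real run, read the file contents first, for example with open on the placements.txt file under your --output path, and pass the text to parse_placements.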

Experimental Study

Our evaluation generates arrival times according to a Poisson process (exponentially distributed inter-arrival and service times) and is only loosely based on the evaluation proposed in the original NFVdeep paper. The load of individual SFCs and VNFs is sampled uniformly within the bounds specified in the respective requests.json files. All results simulate the SFC embedding problem on the real-world Abilene network topology.
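The arrival process described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's generator: the function name, the dictionary keys and the bounds are hypothetical and do not reflect the actual schema of requests.json.

```python
import random


def generate_requests(rng, arrival_rate, mean_service_time, horizon, bounds):
    """Sample service requests from a Poisson arrival process.

    Inter-arrival times are exponential with rate `arrival_rate`, service
    times (TTL) are exponential with mean `mean_service_time`, and resource
    demands are drawn uniformly within `bounds`.
    """
    requests = []
    now = 0.0
    while True:
        now += rng.expovariate(arrival_rate)
        if now > horizon:
            break
        num_vnfs = rng.randint(*bounds["num_vnfs"])
        requests.append({
            "arrival_time": now,
            "ttl": rng.expovariate(1.0 / mean_service_time),
            "bandwidth": rng.uniform(*bounds["bandwidth"]),
            "vnfs": [
                (rng.uniform(*bounds["cpu"]), rng.uniform(*bounds["memory"]))
                for _ in range(num_vnfs)
            ],
        })
    return requests


# Hypothetical bounds, chosen only for illustration.
BOUNDS = {
    "num_vnfs": (1, 4),
    "bandwidth": (10.0, 100.0),
    "cpu": (1.0, 10.0),
    "memory": (1.0, 8.0),
}

requests = generate_requests(
    random.Random(42), arrival_rate=0.1, mean_service_time=50.0,
    horizon=1000.0, bounds=BOUNDS,
)
```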

Sampled Input Traffic

First, we train and evaluate under randomly sampled input traffic traces and compare PPO and its tuned variant against two heuristic baselines: a random placement policy and a greedy first-fit ('FirstFit') heuristic.

Evidently, neither DRL agent matches the greedy baseline's performance in terms of the cumulative episode reward. However, both DRL agents improve upon random placement decisions and, in a few cases, also achieve competitive results.
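For readers unfamiliar with the greedy baseline, a first-fit placement can be sketched as below. This is a simplified illustration and not the repository's FirstFit implementation: it only checks CPU and memory, ignores bandwidth and latency, and the function name and data layout are assumptions.

```python
def first_fit(vnfs, free):
    """Place each VNF on the first node with enough free CPU and memory.

    vnfs: list of (cpu, memory) demands, in chain order.
    free: dict mapping node index -> [free_cpu, free_memory]; updated in
          place when the whole chain fits.
    Returns the list of chosen node indices, or [] if the chain is rejected.
    """
    placements = []
    for cpu, memory in vnfs:
        for node in sorted(free):
            if free[node][0] >= cpu and free[node][1] >= memory:
                free[node][0] -= cpu
                free[node][1] -= memory
                placements.append(node)
                break
        else:
            # No node can host this VNF: release what was already bound
            # for the earlier VNFs of this chain and reject the request.
            for placed_node, (used_cpu, used_memory) in zip(placements, vnfs):
                free[placed_node][0] += used_cpu
                free[placed_node][1] += used_memory
            return []
    return placements


capacity = {0: [8, 8], 1: [16, 16]}
accepted = first_fit([(7, 7), (10, 5)], capacity)
rejected = first_fit([(5, 5), (20, 1)], capacity)
```

The second call cannot host the 20-CPU VNF, so the resources bound for its first VNF are released again, mirroring the empty placement list that marks a rejected request.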

Static Input Traffic

The exogenous input process has a significant influence on an episode's trajectory, independent of the respective agent's placement decisions. Training on episodes with randomly generated input traffic might therefore cause high variance in the reward signal and ultimately prohibit effective policy improvement. To reduce this variance, our experiments with 'static' input replay the same input traffic in every episode.

In comparison to the previous evaluation setup, the (tuned) PPO policy manages to close the reward gap to the FirstFit baseline significantly, even though the greedy heuristic ultimately remains superior.
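The difference between sampled and static input can be sketched as follows, assuming a trace is simply a list of arrival times. The function names are hypothetical and the sketch does not reflect how the repository stores or replays traffic.

```python
import random


def sample_trace(rng, num_requests, arrival_rate):
    """Draw one trace of arrival times from a Poisson process."""
    now = 0.0
    trace = []
    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)
        trace.append(now)
    return trace


def episode_traces(num_episodes, num_requests, arrival_rate, static, seed=0):
    """Return one trace per episode.

    static=True  -> sample a single trace and replay it in every episode.
    static=False -> sample a fresh trace for every episode.
    """
    rng = random.Random(seed)
    if static:
        trace = sample_trace(rng, num_requests, arrival_rate)
        return [list(trace) for _ in range(num_episodes)]
    return [
        sample_trace(rng, num_requests, arrival_rate)
        for _ in range(num_episodes)
    ]


static_traces = episode_traces(3, 5, arrival_rate=0.1, static=True)
sampled_traces = episode_traces(3, 5, arrival_rate=0.1, static=False)
```

With static input, differences in episode reward can only come from the agent's own decisions, which is the variance reduction argued for above.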

Latency Constraints

Lastly, we demonstrate that NFVdeep lacks effective means to learn concepts related to latency. Specifically, the agent is not provided with information about its last placement decision and therefore cannot make an informed decision that minimizes latency. To test this, we compare the performance in two related scenarios that differ only in the maximum latency constraints for SFCs.

While the FirstFit baseline achieves similar performance in both scenarios (the imposed maximum latency does not significantly constrain its placements), we find that the DRL agent's performance deteriorates, which is consistent with the prior hypothesis.
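To make the dependence on the previous placement concrete, the sketch below computes the end-to-end latency of a chain as the sum of shortest-path latencies between consecutive placement nodes. This is an assumed latency model for illustration only; the repository's environment may compute latency differently, and the toy graph and function names are hypothetical.

```python
import heapq


def shortest_latency(graph, source, target):
    """Dijkstra's algorithm over a dict-of-dicts graph with link latencies."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, latency in graph[node].items():
            candidate = dist + latency
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")


def chain_latency(graph, placements):
    """Sum of shortest-path latencies between consecutive placements."""
    return sum(
        shortest_latency(graph, a, b)
        for a, b in zip(placements, placements[1:])
    )


# Toy three-node topology with symmetric link latencies.
GRAPH = {
    0: {1: 5, 2: 20},
    1: {0: 5, 2: 5},
    2: {0: 20, 1: 5},
}

latency = chain_latency(GRAPH, [0, 2])
within_limit = latency <= 15
```

Under this model, the latency added by placing the next VNF depends on where the previous VNF was placed, which is exactly the information the text above says the agent lacks.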

About

Releases: 1 (NFVdeep, latest, Oct 9, 2020)

Languages: Python 100.0%
