Computer Science, 2019, Vol. 46, Issue (9): 291-297. doi:10.11896/j.issn.1002-137X.2019.09.044
卢海峰, 顾春华, 罗飞, 丁炜超, 袁野, 任强
LU Hai-feng, GU Chun-hua, LUO Fei, DING Wei-chao, YUAN Ye, REN Qiang
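Abstract: The rapid growth of cloud data centers has delivered very powerful computing capacity, but the accompanying energy consumption problem has become increasingly serious. To reduce the energy consumed by physical servers in cloud data centers, this paper first models the virtual machine placement problem with reinforcement learning. It then optimizes the Q-Learning(λ) algorithm for the practical problem in two respects, state aggregation and temporal credit. Finally, it runs virtual machine placement experiments on the cloud simulation platform CloudSim with real data sets. The experimental results show that, compared with the Q-Learning, Greedy, and PSO algorithms, the optimized Q-Learning(λ) algorithm reduces the energy consumption of physical servers more effectively and also gives better results across different numbers of virtual machine placement requests, which gives it strong practical value.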
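The abstract names its ingredients (a reinforcement-learning model of virtual machine placement, Q-Learning(λ), and state aggregation) without giving details. As a rough illustration only, and not the authors' formulation, the sketch below applies a tabular Watkins-style Q(λ) update to a toy placement task. Everything specific here is an assumption made for the example: the linear power model, the constants `P_IDLE` and `P_MAX`, the utilization-bucket aggregation in `aggregate`, the penalty for infeasible placements, and all function names.

```python
import random
from collections import defaultdict

P_IDLE, P_MAX = 70.0, 100.0   # hypothetical per-host power in watts (assumption)
N_HOSTS, LEVELS = 3, 4        # host count and number of utilization buckets (assumption)

def power(util):
    """Assumed linear power model; a host with no load counts as switched off."""
    return 0.0 if util == 0 else P_IDLE + (P_MAX - P_IDLE) * util

def aggregate(utils, vm):
    """State aggregation: map each host utilization to one of LEVELS coarse buckets."""
    buckets = tuple(min(int(u * LEVELS), LEVELS - 1) for u in utils)
    return buckets, vm

def step(utils, vm, host):
    """Place a VM on a host; the reward is the negative increase in host power."""
    if utils[host] + vm > 1.0:
        return utils, -P_MAX   # infeasible: penalize and drop the VM (toy rule)
    new = list(utils)
    new[host] += vm
    return new, -(power(new[host]) - power(utils[host]))

def train(demands, episodes=2000, alpha=0.1, gamma=0.9, lam=0.8, eps=0.1, seed=0):
    """Tabular Q(lambda) with accumulating traces, cut on exploratory actions."""
    rng = random.Random(seed)
    q = defaultdict(float)     # Q[(aggregated_state, host)]
    for _ in range(episodes):
        utils, trace = [0.0] * N_HOSTS, defaultdict(float)
        for i, vm in enumerate(demands):
            s = aggregate(utils, vm)
            best = max(q[(s, b)] for b in range(N_HOSTS))
            if rng.random() < eps:
                a = rng.randrange(N_HOSTS)                        # explore
            else:
                a = max(range(N_HOSTS), key=lambda b: q[(s, b)])  # exploit
            if q[(s, a)] < best:
                trace.clear()  # non-greedy action: earlier steps get no more credit
            utils, r = step(utils, vm, a)
            if i + 1 < len(demands):
                s2 = aggregate(utils, demands[i + 1])
                target = r + gamma * max(q[(s2, b)] for b in range(N_HOSTS))
            else:
                target = r     # last VM of the episode: no bootstrapping
            delta = target - q[(s, a)]
            trace[(s, a)] += 1.0
            for key in trace:
                q[key] += alpha * delta * trace[key]
                trace[key] *= gamma * lam   # decay eligibility of visited pairs
    return q

def rollout(q, demands):
    """Follow the greedy policy once and return the resulting total power."""
    utils = [0.0] * N_HOSTS
    for vm in demands:
        s = aggregate(utils, vm)
        a = max(range(N_HOSTS), key=lambda b: q.get((s, b), 0.0))
        utils, _ = step(utils, vm, a)
    return sum(power(u) for u in utils)
```

Calling `train` on a list of CPU demands (fractions of one host's capacity) and passing the resulting table to `rollout` gives the total power of the greedy placement. Because the top bucket merges nearly full and full hosts, the aggregated state is deliberately lossy; that trade of precision for a smaller table is the point of aggregation, and a real study would tune the bucket count.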
LU Hai-feng, GU Chun-hua, LUO Fei, DING Wei-chao, YUAN Ye, REN Qiang. Virtual Machine Placement Strategy with Energy Consumption Optimization under Reinforcement Learning[J]. Computer Science, 2019, 46(9): 291-297. https://doi.org/10.11896/j.issn.1002-137X.2019.09.044