Quality of service optimization algorithm based on deep reinforcement learning in software defined network

LIAO Cenhuishan (廖岑卉珊); CHEN Junyan (陈俊彦); LIANG Guanping (梁观平); XIE Xiaolan (谢小兰); LU Xiaoye (卢小烨)

doi:10.11959/j.issn.2096-3750.2023.00316

Theory and Technology | Updated: 2024-08-16

- Quality of service optimization algorithm based on deep reinforcement learning in software defined network
- Chinese Journal on Internet of Things, 2023, Vol. 7, No. 1, pp. 73-82
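- Author affiliations:
  
  1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, Guangxi
  2. College of Computer, National University of Defense Technology, Changsha 410073, Hunan
- About the authors:
  
  LIAO Cenhuishan (1999- ), female, master's student at Guilin University of Electronic Technology. Research interests: software defined networking and deep reinforcement learning.
  CHEN Junyan (1985- ), male, Ph.D., senior experimentalist at Guilin University of Electronic Technology. Research interests: reinforcement learning, graph neural networks, and software defined networking.
  LIANG Guanping (1998- ), male, Ph.D. student at National University of Defense Technology. Research interests: software defined networking, traffic scheduling, and congestion control.
  XIE Xiaolan (1999- ), female, master's student at Guilin University of Electronic Technology. Research interests: software defined networking, graph neural networks, and deep reinforcement learning.
  LU Xiaoye (2000- ), male, student at Guilin University of Electronic Technology. Research interests: software defined networking and deep reinforcement learning.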
- Funding:
  
  The Guangxi Natural Science Foundation (2020GXNSFDA238001); The Guangxi Project to Improve the Scientific Research Basic Ability of Middle-Aged and Young Teachers (2020KY05033)
- DOI: 10.11959/j.issn.2096-3750.2023.00316
  CLC number: TP393
- Print publication date: 2023-03-30
  
  Online publication date: 2023-03
Cite as: CENHUISHAN LIAO, JUNYAN CHEN, GUANPING LIANG, et al. Quality of service optimization algorithm based on deep reinforcement learning in software defined network[J]. Chinese Journal on Internet of Things, 2023, 7(1): 73-82. DOI: 10.11959/j.issn.2096-3750.2023.00316.

Abstract

Deep reinforcement learning has strong decision-making and generalization abilities and is often applied to quality of service (QoS) optimization in software defined networks (SDN). However, traditional deep reinforcement learning algorithms suffer from slow convergence and instability. An algorithm of quality of service optimization based on deep reinforcement learning (AQSDRL) is proposed to solve the QoS problem of SDN in data center network (DCN) applications. AQSDRL introduces the softmax deep double deterministic policy gradient (SD3) algorithm for model training and uses a SumTree-based prioritized experience replay mechanism to optimize the SD3 algorithm: samples with more significant temporal-difference error (TD-error) are drawn with higher probability to train the neural network, which effectively improves the convergence speed and stability of the algorithm. Experimental results show that, compared with existing deep reinforcement learning algorithms, the proposed AQSDRL effectively reduces network transmission delay and improves the load balancing performance of the network.
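The SumTree-based prioritized replay mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class and function names (`SumTree`, `priority_from_td_error`, `sample_batch`) are hypothetical, and the priority formula `(|TD-error| + eps) ** alpha` with its default values is an assumption borrowed from common prioritized experience replay practice, since the abstract does not state the exact formula. The sketch shows only the data structure: leaves store priorities, internal nodes store sums, so drawing a sample proportionally to its priority takes O(log n).

```python
import random


class SumTree:
    """Array-backed binary tree: leaves hold priorities, internal nodes hold sums."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Indices 0..capacity-2 are internal nodes; capacity-1..2*capacity-2 are leaves.
        self.tree = [0.0] * (2 * capacity - 1)
        self.data = [None] * capacity
        self.write = 0  # next slot to overwrite (ring buffer)

    def total(self):
        """Sum of all stored priorities (value at the root)."""
        return self.tree[0]

    def add(self, priority, sample):
        """Store a sample with the given priority, overwriting the oldest slot."""
        leaf = self.write + self.capacity - 1
        self.data[self.write] = sample
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, leaf, priority):
        """Set a leaf's priority and propagate the difference up to the root."""
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def get(self, value):
        """Return (leaf index, priority, sample) for a value in [0, total)."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):  # descend until idx is a leaf
            left = 2 * idx + 1
            if value < self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]


def priority_from_td_error(td_error, eps=1e-2, alpha=0.6):
    """Assumed priority rule: larger |TD-error| gives larger priority."""
    return (abs(td_error) + eps) ** alpha


def sample_batch(tree, batch_size, rng=random):
    """Stratified sampling: one draw from each equal-width segment of [0, total]."""
    segment = tree.total() / batch_size
    return [tree.get(rng.uniform(segment * i, segment * (i + 1)))
            for i in range(batch_size)]
```

After a training step, the caller would recompute each sampled transition's TD-error and call `update(leaf, priority_from_td_error(td_error))` so that transitions the network still predicts poorly keep being drawn more often.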

Keywords

deep reinforcement learning; software defined network (SDN); quality of service (QoS); data center network (DCN); SumTree

