Multi-domain collaborative anti-jamming based on multi-agent deep reinforcement learning

Biao Zhang; Ximing Wang; Yifan Xu; Wen Li; Hao Han; Songyi Liu; Xueqiang Chen

doi:10.11959/j.issn.2096-3750.2022.00293

Theory and Technology | Updated: 2024-08-16

- Multi-domain collaborative anti-jamming based on multi-agent deep reinforcement learning
- Chinese Journal on Internet of Things, 2022, vol. 6, no. 4, pp. 104-116
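- Author affiliations:
  
  1. College of Communications Engineering, Army Engineering University, Nanjing, Jiangsu 210007
  2. College of Information and Communication, National University of Defense Technology, Wuhan, Hubei 430010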
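- Author biographies:
  
  Biao Zhang (1999- ), male, master's student at the College of Communications Engineering, Army Engineering University. His main research interests are intelligent anti-jamming communications and reinforcement learning.
  Ximing Wang (1993- ), male, Ph.D., lecturer at the College of Information and Communication, National University of Defense Technology. His main research interests include intelligent anti-jamming communications, wireless resource optimization, and multi-agent decision theory.
  Yifan Xu (1995- ), male, Ph.D., lecturer at the College of Communications Engineering, Army Engineering University. His main research interests include wireless communications and intelligent anti-jamming communications.
  Wen Li (1996- ), male, Ph.D. student at the College of Communications Engineering, Army Engineering University. His main research interests include intelligent anti-jamming communications, reinforcement learning, game theory, and dynamic spectrum access.
  Hao Han (1996- ), male, Ph.D. student at the College of Communications Engineering, Army Engineering University. His main research interests include intelligent spectrum countermeasures, intelligent anti-jamming communications, game theory, and machine learning.
  Songyi Liu (1995- ), male, Ph.D. student at the College of Communications Engineering, Army Engineering University. His main research interests include machine learning, intelligent anti-jamming communications, and wireless communication resource optimization.
  Xueqiang Chen (1985- ), male, Ph.D., associate professor at the College of Communications Engineering, Army Engineering University. His main research interests include cognitive radio and wireless spectrum resource optimization.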
- Funding:
  
  The National Natural Science Foundation of China (62071488, 61961010)
- DOI: 10.11959/j.issn.2096-3750.2022.00293
  CLC number: TN973.3; TP181
- Print publication date: 2022-12-30
  
  Online publication date: 2022-12
Biao Zhang, Ximing Wang, Yifan Xu, et al. Multi-domain collaborative anti-jamming based on multi-agent deep reinforcement learning[J]. Chinese Journal on Internet of Things, 2022, 6(4): 104-116. DOI: 10.11959/j.issn.2096-3750.2022.00293.

Abstract

Dynamic transmission requirements and limited buffer space pose great challenges to wireless data transmission in malicious jamming environments. To address these problems, a collaborative anti-jamming method for joint channel selection and data scheduling in the distributed Internet of Things was studied from the frequency-domain and time-domain perspectives. A data transmission model based on a multi-user Markov decision process was constructed, and a collaborative anti-jamming algorithm for joint channel and data decisions based on multi-agent deep reinforcement learning was proposed. Simulation results show that the proposed algorithm effectively avoids malicious jamming and co-channel mutual interference. Compared with the comparison algorithms, network throughput is significantly improved and the number of dropped packets is markedly reduced.

Keywords

collaborative anti-jamming; channel selection; data scheduling; multi-agent reinforcement learning; deep learning
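
The abstract describes a multi-user decision problem in which each user picks a channel (frequency domain) and decides how much buffered data to send in each slot (time domain), while avoiding a jammer and other users on the same channel. The following is a minimal illustrative sketch of one time slot of such an environment, not the authors' model or algorithm. Everything in it is an assumption made for illustration: the class name `ToyJammedNetwork`, the round-robin sweep jammer, the uniform random arrivals, and all parameter values. It contains no learning agent.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyJammedNetwork:
    """Hypothetical toy model; none of these parameters come from the paper."""
    num_users: int = 3
    num_channels: int = 4
    buffer_size: int = 5     # maximum packets each user can hold
    max_arrivals: int = 2    # per-slot arrivals are drawn from 0..max_arrivals
    seed: int = 0
    slot: int = 0
    buffers: list = field(default_factory=list)

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.buffers = [0] * self.num_users

    def jammed_channel(self):
        # Assumed sweep jammer: hits one channel per slot in round-robin order.
        return self.slot % self.num_channels

    def step(self, actions):
        """actions[i] = (channel, packets_to_send) for user i.

        Returns (delivered, dropped): per-user packet counts for this slot.
        """
        jammed = self.jammed_channel()
        channels = [ch for ch, _ in actions]
        delivered, dropped = [], []
        for i, (ch, want) in enumerate(actions):
            send = min(want, self.buffers[i])
            collided = channels.count(ch) > 1   # co-channel mutual interference
            ok = ch != jammed and not collided
            sent = send if ok else 0            # failed packets stay buffered
            self.buffers[i] -= sent
            # New traffic arrives; anything beyond the buffer limit is lost.
            arrivals = self.rng.randint(0, self.max_arrivals)
            overflow = max(0, self.buffers[i] + arrivals - self.buffer_size)
            self.buffers[i] = min(self.buffer_size, self.buffers[i] + arrivals)
            delivered.append(sent)
            dropped.append(overflow)
        self.slot += 1
        return delivered, dropped
```

In a learning setup, each user's agent would observe its own buffer level and recent channel outcomes, call `step` with a joint action, and receive a reward built from `delivered` and `dropped`. How the paper defines its states, actions, and rewards is not stated in the abstract, so that part is left out here.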

