DDPG-based performance optimization algorithm for IRS-assisted simultaneous wireless information and power transfer systems

LUO Liping; PAN Weimin

doi:10.11959/j.issn.2096-3750.2024.00389

Theory and Technology | Updated: 2024-08-16

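- Original title (Chinese): 基于DDPG的智能反射面辅助无线携能通信系统性能优化
- English title: DDPG-based performance optimization algorithm for IRS-assisted simultaneous wireless information and power transfer systems
- Journal: Chinese Journal on Internet of Things (物联网学报), 2024, Vol. 8, No. 2, pp. 46-55
- Author affiliation: School of Electronic Information, Guangxi Minzu University, Nanning 530006, Guangxi, China
- About the authors:
  LUO Liping (born 1980), female, Ph.D., professor and doctoral supervisor at the School of Electronic Information, Guangxi Minzu University. Her main research interest is next-generation wireless communication technology.
  PAN Weimin (born 1999), male, master's student at the School of Electronic Information, Guangxi Minzu University. His main research interests are intelligent reflecting surfaces, simultaneous wireless information and power transfer, and deep reinforcement learning.
- Funding: Guangxi Science and Technology Major Project (AA23073006); Guangxi Minzu University Graduate Innovation Program (gxun-chxs2022298)
- DOI: 10.11959/j.issn.2096-3750.2024.00389
- Print publication date: 2024-06-10
- Received: 2024-01-15
- Revised: 2024-05-09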
LUO Liping, PAN Weimin. DDPG-based performance optimization algorithm for IRS-assisted simultaneous wireless information and power transfer systems[J]. Chinese Journal on Internet of Things, 2024, 8(2): 46-55. DOI: 10.11959/j.issn.2096-3750.2024.00389.

Abstract

For an intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) simultaneous wireless information and power transfer (SWIPT) system, the beamforming vector at the base station and the reflective beamforming vector of the IRS are jointly optimized, subject to the maximum transmit power of the base station, the unit-modulus constraint on the IRS reflection phase-shift matrix, and the minimum energy constraint at the energy receiver. The objective is to maximize the information transmission rate (spectral efficiency). To solve the resulting non-convex optimization problem, a deep deterministic policy gradient (DDPG) algorithm based on deep reinforcement learning is proposed. Simulation results show that the average reward of the DDPG algorithm depends on the learning rate. With an appropriately chosen learning rate, the DDPG algorithm achieves an average mutual information close to that of the traditional optimization algorithm, while its running time is significantly shorter than that of the traditional non-convex optimization algorithm. Even when the number of antennas and the number of reflecting elements are increased, the DDPG algorithm still converges within a short time. This indicates that the DDPG algorithm effectively improves computational efficiency and is better suited to communication services with stringent real-time requirements.
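The three constraints named in the abstract can be illustrated with a minimal Python sketch of how a raw DDPG action might be mapped onto the feasible set and scored. This is not the authors' implementation. It assumes a simplified model with one information receiver and one energy receiver, an effective channel of the form h_d + h_r diag(theta) G, a linear energy-harvesting model with efficiency `eta`, and a hypothetical penalty weight `penalty` for violating the minimum-energy constraint. All function and parameter names are illustrative, and the actor and critic networks are omitted.

```python
import cmath
import math


def project_action(w_raw, phases, p_max):
    """Map a raw actor output onto the feasible set (assumed projection).

    The beamformer is scaled down only if its power exceeds p_max, and the
    IRS coefficients are built as exp(j*phase), so each has unit modulus.
    """
    power = sum(abs(x) ** 2 for x in w_raw)
    scale = math.sqrt(p_max / power) if power > p_max else 1.0
    w = [x * scale for x in w_raw]
    theta = [cmath.exp(1j * p) for p in phases]
    return w, theta


def effective_channel(h_d, h_r, G, theta):
    """Return the row vector h_d + h_r * diag(theta) * G (length M)."""
    n_elems = len(theta)
    return [
        h_d[m] + sum(h_r[n] * theta[n] * G[n][m] for n in range(n_elems))
        for m in range(len(h_d))
    ]


def received_power(h_eff, w):
    """Received signal power |h_eff * w|^2."""
    return abs(sum(h * x for h, x in zip(h_eff, w))) ** 2


def reward(w, theta, ch_ir, ch_er, noise, e_min, eta=1.0, penalty=10.0):
    """Rate at the information receiver minus a penalty on energy shortfall.

    ch_ir and ch_er are (h_d, h_r, G) tuples for the two receivers.
    The penalty form is an assumption, not taken from the paper.
    """
    p_ir = received_power(effective_channel(*ch_ir, theta), w)
    p_er = received_power(effective_channel(*ch_er, theta), w)
    rate = math.log2(1.0 + p_ir / noise)      # bit/s/Hz
    energy = eta * p_er                       # linear harvesting model
    shortfall = max(0.0, e_min - energy)
    return rate - penalty * shortfall, rate, energy
```

In a training loop, the actor output would be split into beamformer entries and phases, passed through `project_action`, and the first value returned by `reward` would be stored in the replay buffer. Handling the power and unit-modulus constraints by construction leaves only the energy constraint to be learned through the reward.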

Keywords

multiple input single output; simultaneous wireless information and power transfer; intelligent reflecting surface; beamforming; deep deterministic policy gradient

