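1. Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong 518000
2. Peng Cheng Laboratory, Shenzhen, Guangdong 518000
3. RISC-V International Open Source Laboratory, Shenzhen, Guangdong 518000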
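XIAO Zijian (1997‒ ), male, master's student at Tsinghua Shenzhen International Graduate School. His main research interests include multi-agent collaboration and reinforcement learning.
HSIA Chen-Chun (1996‒ ), male, master's student at Tsinghua Shenzhen International Graduate School. His main research interests include human-computer interaction, mobile sensing, and reinforcement learning. Co-first author with XIAO Zijian.
XU Yanggang (1999‒ ), male, master's student at Tsinghua Shenzhen International Graduate School. His main research interests include reinforcement learning, large-model agents, and wireless communication.
REN Jiyuan (2000‒ ), female, master's student at Tsinghua Shenzhen International Graduate School. Her main research interests include reinforcement learning and mobile sensing.
CHEN Xinlei (1987‒ ), male, PhD, associate professor at Tsinghua Shenzhen International Graduate School. His main research interests include the intelligent Internet of Things, multi-agent collaboration, reinforcement learning, pervasive computing, and brain-computer interfaces.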
Received: 2024-09-08
Revised: 2024-09-20
Print publication date: 2024-09-10
XIAO Zijian, Hsia Chen-Chun, XU Yanggang, et al. Collaborative altitude-adaptive reinforcement learning for active search with unmanned aerial vehicle swarms[J]. Chinese Journal on Internet of Things, 2024, 08(03): 36-45. DOI: 10.11959/j.issn.2096-3750.2024.00413.
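In volatile and complex disaster environments, rapidly locating survivors is a critical task, and the active search capability of unmanned aerial vehicle (UAV) swarms plays a key role in it. However, a UAV's sensor performance is tightly coupled to its flight altitude, which makes coverage and detection accuracy difficult to balance. To search efficiently, a UAV swarm needs to fly high to cover a wider area and fly low to improve detection accuracy, so strategy design is crucial to the swarm's coordination and decision-making. To address these challenges, the collaborative altitude-adaptive reinforcement learning (CARL) method is proposed, which combines a variable-altitude sensor model, a confidence-based assessment mechanism, and an altitude-adaptive planner based on proximal policy optimization (PPO). With CARL, UAVs can dynamically adjust their sensing strategy according to real-time conditions and make better-informed decisions. In addition, a novel reward shaping strategy is introduced to maximize search efficiency in large environments. In simulations under a variety of conditions, CARL improves the full recovery rate by 12% over baseline methods, demonstrating its effectiveness for UAV swarms in active search tasks.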
Active search with unmanned aerial vehicle (UAV) swarms in cluttered and unpredictable environments poses a critical challenge in search and rescue missions, where the rapid localization of survivors is of paramount importance, as the majority of urban disaster victims are surface casualties. However, the altitude-dependent sensor performance of UAVs introduces a crucial trade-off between coverage and accuracy, significantly influencing the coordination and decision-making of UAV swarms. The optimal strategy has to strike a balance between exploring larger areas at higher altitudes and exploiting regions of high target probability at lower altitudes. To address these challenges, collaborative altitude-adaptive reinforcement learning (CARL) is proposed, which incorporates an altitude-aware sensor model, a confidence-informed assessment module, and an altitude-adaptive planner based on the proximal policy optimization (PPO) algorithm. CARL enables UAVs to dynamically adjust their sensing locations and make informed decisions. Furthermore, a tailored reward shaping strategy is introduced to maximize search efficiency in extensive environments. Comprehensive simulations under diverse conditions demonstrate that CARL surpasses baseline methods, achieving a 12% improvement in full recovery rate, and showcase its potential for enhancing the effectiveness of UAV swarms in active search missions.
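
The coverage-versus-accuracy trade-off described above can be illustrated with a small sketch. It is not the sensor model used in CARL, which the abstract does not specify. The square footprint, the 60-degree field of view, the linear accuracy decay between 5 m and 30 m, and all function names are assumptions chosen only for illustration.

```python
import math


def footprint_side(altitude_m: float, fov_deg: float = 60.0) -> float:
    """Side length (m) of the ground footprint seen from a given altitude.

    Assumes a downward-facing camera with a square field of view (hypothetical).
    """
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def detection_accuracy(altitude_m: float, acc_low: float = 0.95,
                       acc_high: float = 0.60, h_min: float = 5.0,
                       h_max: float = 30.0) -> float:
    """Assumed linear drop in detection accuracy from h_min to h_max."""
    clamped = min(max(altitude_m, h_min), h_max)
    frac = (clamped - h_min) / (h_max - h_min)
    return acc_low + frac * (acc_high - acc_low)


def update_belief(prior: float, detected: bool, accuracy: float) -> float:
    """Bayes update of P(target in cell) for a symmetric sensor.

    Assumes true-positive rate == true-negative rate == accuracy.
    """
    like_target = accuracy if detected else 1.0 - accuracy
    like_empty = 1.0 - accuracy if detected else accuracy
    numerator = like_target * prior
    return numerator / (numerator + like_empty * (1.0 - prior))


if __name__ == "__main__":
    for h in (5.0, 15.0, 30.0):
        acc = detection_accuracy(h)
        belief = update_belief(0.5, True, acc)
        print(f"altitude={h:5.1f} m  footprint={footprint_side(h):6.2f} m  "
              f"accuracy={acc:.2f}  belief after one detection={belief:.2f}")
```

Under these assumptions, a higher altitude covers a larger footprint per observation, but a single detection moves the belief less. This is the tension an altitude-adaptive planner has to resolve.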