Multi-data fusion-aided indoor localization based on continuous action space deep reinforcement learning

Xuechen Chen; Jiaxuan Yi; Aixiang Wang; Xiaoheng Deng

doi:10.11959/j.issn.2096-3750.2024.00358


Theory and Technology | Updated: 2024-06-05

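- Chinese title: 基于连续动作空间深度强化学习的多数据融合室内定位方法
- English title: Multi-data fusion-aided indoor localization based on continuous action space deep reinforcement learning
- Chinese Journal on Internet of Things (物联网学报), 2024, Vol. 8, No. 1, pp. 40-48
- Author affiliations:
  1. School of Electronic Information, Central South University, Changsha, Hunan 410004
  2. School of Computer Science, Central South University, Changsha, Hunan 410083
- About the authors:
  - Xuechen Chen (b. 1984), female, is an associate professor at the School of Electronic Information, Central South University. Her main research interests are wireless communication theory and systems and intelligent indoor localization.
  - Jiaxuan Yi (b. 1999), male, is a master's student at the School of Electronic Information, Central South University. His main research interests are intelligent indoor localization, wireless communication, and artificial intelligence.
  - Aixiang Wang (b. 2000), male, is a master's student at the School of Computer Science, Central South University. His main research interests are federated learning, wireless communication, and indoor localization.
  - Xiaoheng Deng (b. 1974), male, is a professor and dean of the School of Electronic Information, Central South University. His main research interests are wireless networks and edge computing, the Internet of Things and big data, intelligent Internet of Vehicles, and distributed computing and systems.
- Funding: The National Natural Science Foundation of China (62172441); The Key Research and Development Program of Sichuan Province (2023YFG0120)
- DOI: 10.11959/j.issn.2096-3750.2024.00358
- CLC number: TN915.08
- Print publication date: 2024-03-30
- Online publication date: 2024-03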
Xuechen Chen, Jiaxuan Yi, Aixiang Wang, et al. Multi-data fusion-aided indoor localization based on continuous action space deep reinforcement learning[J]. Chinese Journal on Internet of Things, 2024, 8(1): 40-48. DOI: 10.11959/j.issn.2096-3750.2024.00358.

Abstract

Smartphone-based indoor localization has attracted considerable attention in both research and industry. However, localization accuracy and robustness remain challenging, particularly in complex indoor environments. Given that pedestrian dead reckoning (PDR) algorithms are widely available on recent smartphones, an indoor localization fusion method based on the twin delayed deep deterministic policy gradient (TD3) was proposed. The method integrates Wi-Fi information with PDR data, models the PDR localization process as a Markov process, and introduces a continuous action space for the agent. The proposed method was compared experimentally with three state-of-the-art deep Q network (DQN) based indoor localization methods. The experimental results show that the proposed method significantly reduces localization error and improves localization accuracy.
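The abstract does not give the state, action, or reward definitions, so the following is only a minimal sketch of what a PDR step with a continuous correction action might look like, together with the two standard TD3 ingredients (clipped double-Q target and target policy smoothing). The action layout `(a_len, a_head)`, the correction bounds, and the Wi-Fi-distance reward are all assumptions made for illustration, not the authors' design.

```python
import math
import random


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def pdr_step(pos, step_len, heading, action,
             max_len_corr=0.2, max_head_corr=math.radians(15)):
    """One PDR transition with a continuous correction action.

    pos      -- (x, y) current position estimate in metres
    step_len -- step length reported by PDR, in metres
    heading  -- heading reported by PDR, in radians
    action   -- (a_len, a_head), each in [-1, 1] (hypothetical layout)
    The correction bounds are illustrative assumptions.
    """
    a_len = clamp(action[0], -1.0, 1.0)
    a_head = clamp(action[1], -1.0, 1.0)
    length = step_len + a_len * max_len_corr
    theta = heading + a_head * max_head_corr
    return (pos[0] + length * math.cos(theta),
            pos[1] + length * math.sin(theta))


def reward(pos, wifi_pos):
    """Hypothetical reward: negative distance to a Wi-Fi position estimate."""
    return -math.dist(pos, wifi_pos)


def td3_target(r, q1_next, q2_next, gamma=0.99, done=False):
    """TD3 clipped double-Q target: r + gamma * min(Q1', Q2')."""
    if done:
        return r
    return r + gamma * min(q1_next, q2_next)


def smoothed_target_action(mu, sigma=0.2, noise_clip=0.5, rng=random):
    """TD3 target policy smoothing: clipped Gaussian noise, then clip to [-1, 1]."""
    smoothed = []
    for m in mu:
        noise = clamp(rng.gauss(0.0, sigma), -noise_clip, noise_clip)
        smoothed.append(clamp(m + noise, -1.0, 1.0))
    return tuple(smoothed)
```

With a zero action the transition reduces to plain dead reckoning, so `pdr_step((0.0, 0.0), 0.7, 0.0, (0.0, 0.0))` moves the estimate 0.7 m along the x axis. A discrete-action agent such as DQN would have to pick from a fixed grid of corrections, whereas here any value in the bounded range is allowed.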

Keywords

Wi-Fi; pedestrian dead reckoning; indoor localization; twin delayed deep deterministic policy gradient; deep reinforcement learning

