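1. School of Information Science and Technology, North China University of Technology, Beijing 100144
2. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing 100144
3. Shijingshan District People's Procuratorate, Beijing 100043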
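YANG Kunfeng (2000-), male, is a master's student at the School of Information Science and Technology, North China University of Technology. His main research interests are large language models and big data.
DING Weilong (1983-), male, Ph.D., is an associate professor at the School of Information Science and Technology, North China University of Technology. His main research interests are spatiotemporal big data, artificial intelligence technologies for multimodal big data, and services computing.
TIAN Han (1998-), male, is a master's student at the School of Information Science and Technology, North China University of Technology. His main research interests are big data, workflow technology, and artificial intelligence.
LI Yang (1988-), female, is a fourth-level senior prosecutor at the Shijingshan District People's Procuratorate, Beijing. Her main research interest is criminal prosecution.
WANG Bo (1987-), male, is a third-level principal staff member at the Shijingshan District People's Procuratorate, Beijing. His main research interests include legal image representation and intent analysis.
ZHAO Guangjing (1982-), female, is a fourth-level senior prosecutor at the Shijingshan District People's Procuratorate, Beijing. Her main research interests include criminal prosecution.
WANG Xiaojie (1991-), female, is a prosecutor's assistant at the Shijingshan District People's Procuratorate, Beijing. Her main research interest is law.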
Received: 2024-10-21
Revised: 2025-01-23
Published in print: 2025-12-10
YANG Kunfeng, DING Weilong, TIAN Han, et al. Method for extracting suspect relationship in healthcare insurance fraud supervision based on large language models[J]. Chinese Journal on Internet of Things, 2025, 09(04): 137-148. DOI: 10.11959/j.issn.2096-3750.2025.00469.
Healthcare insurance fraud is becoming increasingly serious worldwide, posing significant threats to both the economy and the healthcare system. Compared with the widely studied prescription stage, the drug collection stage in the resale of "returned drugs" is the most covert and most difficult link in the entire fraud supervision chain. To address this stage, the concept of mobile crowd-sensing from the Internet of things (IoT) was adopted: each suspect's mobile phone is treated as a sensor from which multimodal data, including WeChat chat logs, are automatically collected and then analyzed to construct a social Internet of things (SIoT) for the suspect. However, when WeChat chat logs are processed during the data processing and analysis phase, existing relationship extraction methods struggle to reach adequate accuracy because of the multimodal heterogeneity of the data and the differences in suspects' communication habits. To overcome these challenges, a suspect relationship extraction method based on large language models (LLMs) was proposed. It applies LLMs for the first time to extract suspect relationships from the social media chat logs of suspects in the "returned drugs" resale process, with prompt templates and model interaction optimization designed specifically for chat logs. The method has been applied in the healthcare insurance fraud supervision work of the Shijingshan District People's Procuratorate in Beijing. Extensive experiments on real-world data demonstrate that the approach, leveraging multiple mobile sensing nodes in the SIoT, significantly improves the processing of complex multimodal data as well as the accuracy and efficiency of suspect relationship extraction, and helps to identify key individuals in the upstream and downstream links of healthcare insurance fraud.
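The abstract does not reproduce the prompt templates or the interaction logic, so the following is only a minimal sketch of what prompt-based relationship extraction over a chat log could look like. The template wording, the `ChatMessage` and `Relation` types, the JSON reply format, and the example relation label are all assumptions made for illustration and are not taken from the paper. No real LLM is called; the reply is a hard-coded stand-in.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatMessage:
    sender: str
    text: str


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    relation: str


# Hypothetical template: the paper's actual prompt wording is not given in the abstract.
PROMPT_TEMPLATE = (
    "You are assisting a fraud-supervision analyst.\n"
    "Read the chat log and list the relationships between participants.\n"
    "Answer only with a JSON array of objects with the keys "
    '"source", "target" and "relation".\n'
    "Chat log:\n{chat_log}\n"
)


def build_prompt(messages):
    """Render the chat messages into the (assumed) prompt template."""
    chat_log = "\n".join(f"{m.sender}: {m.text}" for m in messages)
    return PROMPT_TEMPLATE.format(chat_log=chat_log)


def parse_relations(reply):
    """Parse a model reply, skipping malformed items instead of failing."""
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return []
    relations = []
    for item in items:
        # Keep only objects that carry all three expected keys.
        if isinstance(item, dict) and item.keys() >= {"source", "target", "relation"}:
            relations.append(
                Relation(str(item["source"]), str(item["target"]), str(item["relation"]))
            )
    return relations


if __name__ == "__main__":
    messages = [
        ChatMessage("A", "I have ten boxes, same price as last time?"),
        ChatMessage("B", "Yes, send them tomorrow."),
    ]
    print(build_prompt(messages))
    # Stand-in for a model reply; a real system would call an LLM here.
    fake_reply = '[{"source": "A", "target": "B", "relation": "sells_to"}]'
    print(parse_relations(fake_reply))
```

Asking for a fixed JSON shape and parsing it defensively is one common way to turn free-form model output into edges that can be added to a relationship graph; the method described in the abstract may structure its output differently.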