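1. School of Information Engineering, Ningxia University, Yinchuan 750021, Ningxia, China
2. Ningxia Key Laboratory of Artificial Intelligence and Information Security for "East Data, West Computing", Yinchuan 750021, Ningxia, China
Deng Zhen (邓箴), dengzhen@nxu.edu.cn
Received: 2025-09-19
Revised: 2025-11-10
Accepted: 2025-11-26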
CAO Zhigang, LIU Libo, LIU Qian, et al. Application of Hypergraph-Driven "Trinity" Semantic Framework in Retail Commodity Detection[J/OL]. Chinese Journal on Internet of Things (物联网学报), 2026.
The operating efficiency of retail self-checkout systems directly affects the shopping experience. Existing systems struggle to extract fine-grained key features of small commodities and lack sufficient spatial reasoning in heavily occluded scenes, which limits detection performance. To address these problems, this paper designs a hypergraph-driven trinity semantic framework on top of YOLOv12. First, a cross-layer semantic gathering module combines the Cross-Layer Feature Attention Fusion (CFAF) submodule with Pinwheel-Shaped Convolution (PSConv) so that semantics from different levels complement each other and fine-grained features are preserved efficiently in the deep network, which strengthens the capture of fine-grained features of small commodities. Second, a semantic refinement module is introduced; its core component, Hybrid Dynamic Perception Hypergraph Computation (HDP-HGC), improves recognition performance in heavily occluded scenes. Finally, a semantic mapping module establishes semantic interaction channels between non-adjacent layers. Together, the three modules form the complete trinity semantic framework. The method is evaluated systematically on the public datasets RPC and FinVolution and on Mine, a self-built dataset collected in real scenes. The proposed algorithm reaches an average detection precision of 90.94%, 92.04%, and 69.61% on the three datasets, respectively, ahead of most current mainstream models. Its advantage is most pronounced in small-commodity recognition and heavily occluded checkout scenes, and it keeps recognition accuracy high while keeping the parameter count under control. The method also reduces the probabilities of missed detections and false detections by approximately 30% and 50%, respectively, which further supports its stability and practical value in complex retail environments.
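The abstract does not spell out how HDP-HGC is computed, so the following is only a minimal sketch of generic hypergraph message passing, not the authors' module. It assumes hyperedges are built with a distance threshold (one hyperedge per vertex, containing every vertex within `radius` of it) and uses plain mean aggregation without learnable weights; the function names `build_incidence` and `hypergraph_conv` and the parameter `radius` are hypothetical.

```python
import math


def build_incidence(features, radius):
    """Assumed construction: hyperedge e is centred on vertex e and holds
    every vertex whose feature vector lies within `radius` of it."""
    n = len(features)
    # H[v][e] = 1.0 if vertex v belongs to hyperedge e, else 0.0
    return [
        [1.0 if math.dist(features[v], features[e]) <= radius else 0.0
         for e in range(n)]
        for v in range(n)
    ]


def hypergraph_conv(features, H):
    """Two-stage mean aggregation: vertices -> hyperedges -> vertices.
    This equals D_v^{-1} H D_e^{-1} H^T X with no learnable projection."""
    n, m, d = len(H), len(H[0]), len(features[0])
    # Stage 1: each hyperedge averages the features of its member vertices.
    edge_feats = []
    for e in range(m):
        members = [v for v in range(n) if H[v][e] > 0]
        edge_feats.append(
            [sum(features[v][k] for v in members) / len(members) for k in range(d)]
        )
    # Stage 2: each vertex averages the features of the hyperedges it belongs to.
    out = []
    for v in range(n):
        edges = [e for e in range(m) if H[v][e] > 0]
        out.append(
            [sum(edge_feats[e][k] for e in edges) / len(edges) for k in range(d)]
        )
    return out


# Toy example: two nearby feature points and one distant point.
points = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
H = build_incidence(points, radius=1.5)
smoothed = hypergraph_conv(points, H)
print(smoothed)  # [[0.5, 0.0], [0.5, 0.0], [10.0, 10.0]]
```

In this toy case the two nearby points share hyperedges and are pulled to their common mean, while the distant point sits alone in its own hyperedge and is left unchanged. This kind of high-order grouping is what hypergraph computation is generally used for when a single object's features are fragmented, for example by occlusion.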