1. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
2. School of Information Science and Engineering, Southeast University, Nanjing 211189, China
3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
4. School of Information and Communication Engineering, Hainan University, Haikou 570228, China
ZHAO Junhui (1973- ), male, Ph.D., professor and doctoral supervisor at the School of Electronic and Information Engineering, Beijing Jiaotong University. His research interests include wireless and mobile communications and related applications, 5G mobile communication technology, high-speed railway communications, vehicular communication networks, wireless positioning, and cognitive radio.
LI Huaicheng (1998- ), male, Ph.D. candidate at the School of Electronic and Information Engineering, Beijing Jiaotong University. His research interests include green communications, model compression, and image processing.
WANG Dongming (1977- ), male, Ph.D., professor and doctoral supervisor at the School of Information Science and Engineering, Southeast University. His research interests include fundamental theory and techniques of cell-free massive distributed MIMO, key technologies for 6G wireless transmission, millimeter-wave distributed MIMO theory and techniques, and the 5G physical-layer protocol stack.
LI Jiamin (1983- ), male, Ph.D., professor and doctoral supervisor at the School of Information Science and Engineering, Southeast University. His research interests include 6G cell-free intelligent radio access networks, ultra-reliable low-latency communication for massive terminals, integrated sensing and communication, 6G extreme connectivity, and the development and experimental validation of comprehensive testbeds for future mobile communications.
ZHOU Yiqing (1975- ), female, Ph.D., researcher and deputy director of the Wireless Communication Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, and doctoral supervisor. Her research interests include broadband wireless communication technology, convergence of communication and computing, heterogeneous networks, cooperative transmission, and green radio.
SHU Feng (1973- ), male, Ph.D., professor and doctoral supervisor at the School of Information and Communication Engineering, Hainan University. His research interests include intelligent wireless communications, information security, and massive MIMO direction finding and positioning.
Print publication date: 2024-12-10
Received: 2024-10-15
Revised: 2024-12-10
ZHAO J H, LI H C, WANG D M, et al. Model pruning techniques in the Internet of things: state of the art, methods and perspectives[J]. Chinese Journal on Internet of Things, 2024, 8(4): 1-13. DOI: 10.11959/j.issn.2096-3750.2024.00448.
In the context of the rapid development of Internet of things (IoT) technology, IoT devices are constrained by computing power, storage space, communication bandwidth, and battery life, and therefore face challenges in running complex artificial intelligence (AI) algorithms, especially deep learning models. Model pruning reduces redundant parameters in neural networks, effectively lowering computation and storage requirements without impairing AI model performance, which makes it well suited to optimising AI models deployed on IoT devices. Firstly, two typical and currently popular model pruning techniques, structured pruning and unstructured pruning, each suited to different application scenarios, are reviewed. Secondly, the diverse applications of these methods in IoT environments are analysed in detail. Finally, in light of the latest research results, the limitations of current model pruning are discussed in detail, and future development directions for model pruning methods in IoT are outlined.
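The contrast the abstract draws between the two pruning styles can be illustrated with a minimal NumPy sketch (illustrative code, not from the paper; function names are hypothetical): unstructured pruning zeroes individual low-magnitude weights wherever they occur, while structured pruning removes entire filters, which directly shrinks the tensor shape and so maps more easily onto resource-limited IoT hardware.

```python
import numpy as np

def unstructured_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of individual weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask          # same shape, sparse contents

def structured_prune_filters(conv_weights, keep_ratio):
    """Keep only the filters with the largest L1 norms.

    conv_weights has shape (out_channels, in_channels, kH, kW);
    the result has fewer output channels, i.e. a genuinely smaller tensor.
    """
    norms = np.abs(conv_weights).sum(axis=(1, 2, 3))      # L1 norm per filter
    n_keep = max(1, int(keep_ratio * conv_weights.shape[0]))
    keep = np.sort(np.argsort(norms)[-n_keep:])           # indices of strongest filters
    return conv_weights[keep]
```

Unstructured pruning preserves the dense tensor layout and needs sparse-aware kernels to realise speedups, whereas the structured variant yields a smaller dense model that any runtime can execute faster, a trade-off the survey examines for IoT deployment.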
Keywords: IoT; resource constraints; model pruning; AI; deep learning