2022 4th International Conference on Natural Language Processing (ICNLP)

Abstract

Deep neural network models typically contain a large number of weight parameters. To reduce their storage footprint, many weight pruning algorithms have been proposed. We observe that traditional weight pruning algorithms require substantial computation to evaluate the importance of parameters before pruning. This paper omits the parameter-importance evaluation step and proposes random pruning, showing that a randomly pruned model can still achieve good performance after fine-tuning. In addition, a sparsity function is introduced into the pruning process, and the model's sparsity is used to guide pruning and enhance the stability of the network. Finally, a low-rank decomposition algorithm is combined with pruning to further accelerate the network. Using the CIFAR-10 dataset, we compare the proposed algorithm with similar algorithms on the ResNet-50 and VGG-16 networks, and it is superior in terms of compression rate, FLOPs, and stability. Compared with the baseline networks, the proposed algorithm achieves compression rates of 1.99x and 2.8x and speedups of 1.87x and 1.83x on the two networks, respectively.
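To make the core idea concrete, the sketch below shows random unstructured pruning followed by fine-tuning, using PyTorch's pruning utilities. This is a minimal illustration under assumed settings (the 50% pruning ratio, the VGG-16 backbone, and the optimizer hyperparameters are illustrative choices, not values taken from the paper), not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import vgg16

# CIFAR-10 has 10 classes; the backbone choice here is illustrative.
model = vgg16(num_classes=10)

# Randomly prune 50% of the weights in every convolutional layer,
# skipping any importance-evaluation step entirely.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.random_unstructured(module, name="weight", amount=0.5)

# Fine-tune the pruned model so the remaining weights recover accuracy.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def fine_tune_step(inputs, labels):
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()   # the pruning mask zeroes gradients at pruned positions
    optimizer.step()
    return loss.item()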