Neural Networks with Model Compression
By Baochang Zhang, Tiancheng Wang, Sheng Xu, David Doermann
- Publisher: Springer
- Year: 2024
- Language: English
- Pages: 267
- Category: Library
Free of charge; no registration required. For personal study only.
Synopsis
Deep learning has achieved impressive results in computer vision tasks such as image classification, as well as in natural language processing. To achieve better performance, deeper and wider networks have been designed, which increases the demand for computational resources. The number of floating-point operations (FLOPs) has grown dramatically with larger networks, and this has become an obstacle to deploying convolutional neural networks (CNNs) on mobile and embedded devices. In this context, our book focuses on CNN compression and acceleration, which are important topics for the research community. We describe numerous methods, including parameter quantization, network pruning, low-rank decomposition, and knowledge distillation. More recently, to reduce the burden of handcrafted architecture design, neural architecture search (NAS) has been used to build neural networks automatically by searching over a vast architecture space. Our book also introduces NAS, given its state-of-the-art performance in applications such as image classification and object detection. We further describe extensive applications of compressed deep models in image classification, speech recognition, object detection, and tracking. These topics can help researchers better understand the usefulness and potential of network compression in practical applications. Interested readers should have basic knowledge of machine learning and deep learning to follow the methods described in this book.
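To make one of the techniques named above concrete, the sketch below illustrates parameter quantization in its simplest form: uniform symmetric quantization of floating-point weights to low-bit signed integer codes. This is a generic, minimal illustration, not the specific algorithm from any chapter of the book; the function name and the pure-Python formulation are our own.

```python
def quantize_symmetric(weights, num_bits=8):
    """Uniform symmetric quantization: map floats to signed integer codes.

    A single scale factor maps the largest-magnitude weight to the
    largest representable integer; all weights share that scale.
    """
    qmax = 2 ** (num_bits - 1) - 1               # e.g. 127 for 8-bit
    scale = max(abs(w) for w in weights) / qmax  # one scale per tensor
    codes = [round(w / scale) for w in weights]  # integer codes
    dequant = [q * scale for q in codes]         # approximate reconstruction
    return codes, dequant, scale

codes, dequant, scale = quantize_symmetric([0.5, -1.0, 0.25, 0.75], num_bits=8)
print(codes)  # [64, -127, 32, 95]
```

Storing the integer codes plus one scale per tensor is what shrinks the model; the reconstruction error of each weight is bounded by half the scale, which is the trade-off the more sophisticated quantization methods in Chapter 4 (e.g., learned step sizes, mixed precision) are designed to manage.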
Table of Contents
Preface
Contents
1 Introduction
1.1 Background
1.2 Introduction of Deep Learning
1.3 Model Compression and Acceleration
References
2 Binary Neural Networks
2.1 Introduction
2.2 Gradient Approximation
2.3 Quantization
2.4 Structural Design
2.5 Loss Design
2.6 Optimization
2.7 Algorithms for Binary Neural Networks
2.7.1 BNN: Binary Neural Network
2.7.2 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
2.7.3 SA-BNN: State-Aware Binary Neural Network
2.7.3.1 Method
2.7.3.2 Experiments
2.7.4 PCNN: Projection Convolutional Neural Networks
2.7.4.1 Projection
2.7.4.2 Optimization
2.7.4.3 Theoretical Analysis
2.7.4.4 Projection Convolutional Neural Networks
2.7.4.5 Forward Propagation Based on Projection Convolution Layer
2.7.4.6 Backward Propagation
2.7.4.7 Progressive Optimization
2.7.4.8 Ablation Study
References
3 Binary Neural Architecture Search
3.1 Introduction
3.2 Neural Architecture Search
3.2.1 ABanditNAS: Anti-bandit for Neural Architecture Search
3.2.1.1 Anti-Bandit Algorithm
3.2.1.2 Search Space
3.2.1.3 Anti-bandit Strategy for NAS
3.2.1.4 Adversarial Optimization
3.2.1.5 Analysis
3.2.2 IDARTS: Interactive Differentiable Architecture Search
3.2.2.1 Bilinear Models for DARTS
3.2.2.2 Search Space
3.2.2.3 Backtracking Back Propagation
3.2.2.4 Comparison of Searching Methods
3.2.3 Fast and Unsupervised Neural Architecture Evolution for Visual Representation Learning
3.2.3.1 Search Space
3.2.3.2 Evolution
3.2.3.3 Contrastive Learning
3.2.3.4 Fast Evolution by Eliminating Operations
3.2.3.5 Experiments
3.3 Binary Neural Architecture Search
3.3.1 BNAS: Binarized Neural Architecture Search for Efficient Object Recognition
3.3.1.1 Search Space
3.3.1.2 Binarized Optimization for BNAS
3.3.1.3 Performance-Based Strategy for BNAS
3.3.1.4 Gradient Update for BNAS
3.3.1.5 Ablation Study
3.3.2 BDetNAS: A Fast Binarized Detection Neural Architecture Search
3.3.2.1 Search Space
3.3.2.2 Performance-Based Strategy for BDetNAS
3.3.2.3 Optimization for BDetNAS
3.3.2.4 Experiments
References
4 Quantization of Neural Networks
4.1 Introduction
4.2 Quantitative Arithmetic Principles
4.3 Uniform and Nonuniform Quantization
4.4 Symmetric and Asymmetric Quantization
4.5 Comparison of Different Quantization Methods
4.5.1 LSQ: Learned Step Size Quantization
4.5.1.1 Notations
4.5.1.2 Step Size Gradient
4.5.1.3 Step Size Gradient Scale
4.5.1.4 Training
4.5.2 TRQ: Ternary Neural Networks with Residual Quantization
4.5.2.1 Preliminary
4.5.2.2 Generalization to n-Bit Quantization
4.5.2.3 Complexity Analysis
4.5.2.4 Differences of TRQ from Existing Residual Quantization Methods
4.5.2.5 Implementation Details
4.5.2.6 Ablation Study on CIFAR
4.5.3 OMPQ: Orthogonal Mixed Precision Quantization
4.5.3.1 Network Orthogonality
4.5.3.2 Efficient Orthogonality Metric
4.5.3.3 Mixed Precision Quantization
4.5.3.4 Experiment
4.5.3.5 Ablation Study
References
5 Network Pruning
5.1 Introduction
5.2 Structured Pruning
5.3 Unstructured Pruning
5.4 Network Pruning
5.4.1 Efficient Structured Pruning Based on Deep Feature Stabilization
5.4.1.1 Preliminaries
5.4.1.2 Sparse Supervision for Block Pruning
5.4.1.3 Constrained Sparse Supervision for Filter Pruning
5.4.1.4 Loss Function
5.4.1.5 Optimization
5.4.1.6 Pruning on ResNet
5.4.1.7 Experiments
5.4.1.8 Ablation Study
5.4.2 Toward Compact and Sparse CNNs via Expectation-Maximization
5.4.2.1 Preliminaries
5.4.2.2 Distribution-Aware Forward and Loss Function
5.4.2.3 Optimization and Analysis
5.4.2.4 Filter Modification
5.4.2.5 Experiments
5.4.2.6 Efficiency Analysis
5.4.3 Pruning Multi-view Stereo Net for Efficient 3D Reconstruction
5.4.3.1 Channel Pruning for 2D CNNs
5.4.3.2 Optimization Based on a Mixed Back Propagation
5.4.3.3 3D CNN Pruning
5.4.3.4 Loss Function
5.4.3.5 Implementation of 2D/3D MVS Net
5.4.3.6 Performance Comparison
5.4.4 Cogradient Descent for Dependable Learning
5.4.4.1 Gradient Descent
5.4.4.2 Cogradient Descent for Dependable Learning
5.4.4.3 Applications
5.4.4.4 Network Pruning
5.4.4.5 Experiments
5.4.4.6 Ablation Study
5.5 Network Pruning on BNNs
5.5.1 Rectified Binary Convolutional Networks with Generative Adversarial Learning
5.5.1.1 Loss Function
5.5.1.2 Learning RBCNs
5.5.1.3 Network Pruning
5.5.1.4 Learning Pruned RBCNs
5.5.1.5 Ablation Study
5.5.2 BONN: Bayesian Optimized Binary Neural Network
5.5.2.1 Bayesian Formulation for Compact 1-Bit CNNs
5.5.2.2 Bayesian Learning Losses
5.5.2.3 Bayesian Pruning
5.5.2.4 BONNs
5.5.2.5 Forward Propagation
5.5.2.6 Asynchronous Backward Propagation
5.5.2.7 Ablation Study
References
6 Applications
6.1 Introduction
6.2 Image Classification
6.3 Speech Recognition
6.3.1 1-Bit WaveNet: Compression of a Generative Neural Network in Speech Recognition with Two Binarized Methods
6.3.1.1 Network Architecture
6.3.1.2 Bi-Real Net Binarization
6.3.1.3 Projection Convolutional Neural Network Binarization
6.4 Object Detection and Tracking
6.4.1 Data-Adaptive Binary Neural Networks for Efficient Object Detection and Recognition
6.4.1.1 Data-Adaptive Amplitude Method
6.4.1.2 Data-Adaptive Channel Amplitude
6.4.1.3 Data-Adaptive Spatial Amplitude
6.4.1.4 Experiment on Object Recognition
6.4.1.5 Ablation Study on Object Recognition
6.4.1.6 Network Accuracy Comparison on ImageNet
6.4.1.7 Experiment on Object Detection
6.4.1.8 Performance Comparison on PASCAL VOC
6.4.1.9 Computation and Storage Analysis
6.4.2 Amplitude Suppression and Direction Activation in Networks for Faster Object Detection
6.4.2.1 Methodology
6.4.2.2 Back Propagation
6.4.2.3 Amplitude Calculation and Suppression
6.4.2.4 Experiments
6.4.2.5 Ablation Study
6.4.2.6 Object Detection
6.4.2.7 Image Classification
6.4.3 Q-YOLO: Efficient Inference for Real-Time Object Detection
6.4.3.1 Preliminaries
6.4.3.2 Uniform Quantization
6.4.3.3 Quantization Range Setting
6.4.3.4 Unilateral Histogram (UH)-Based Activation Quantization
6.4.3.5 Experiments
6.4.3.6 Ablation Study
6.4.3.7 Quantization Type
6.4.3.8 Inference Speed
References
SIMILAR VOLUMES
Studies of the evolution of animal signals and sensory behaviour have more recently shifted from considering 'extrinsic' (environmental) determinants to 'intrinsic' (physiological) ones. The drive behind this change has been the increasing availability of neural network models. With contributions fr
Research in neural networks has escalated dramatically in the last decade, acquiring along the way terms and concepts, such as learning, memory, perception, recognition, which are the basis of neuropsychology. Nevertheless, for many, neural modelling remains controversial in its purported ability to