Problems of NMT models
- Over-Parameterization
- Long running time
- Overfitting
- Large storage size
Redundancy in NMT models
Most important: higher layers; attention and softmax weights
Most redundant: lower layers; embedding weights
Traditional Solutions
Optimal Brain Damage (OBD) and Optimal Brain Surgeon (OBS)
Recent approaches
Magnitude-based pruning with iterative retraining has yielded strong results for Convolutional Neural Networks (CNNs) on visual tasks.
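A minimal sketch of magnitude-based pruning with iterative retraining may help make the idea concrete. It is an illustration under stated assumptions, not the procedure from any particular paper: a tiny linear model stands in for a real network, and the function names (`magnitude_prune`, `retrain`, `iterative_prune`), the sparsity schedule, and the learning rate are all hypothetical choices made for this example.

```python
import random

def magnitude_prune(weights, sparsity):
    """Return a 0/1 mask that drops the `sparsity` fraction of smallest-|w| weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first (already-zero weights come first).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1.0] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0.0
    return mask

def retrain(weights, mask, data, lr=0.05, epochs=200):
    """SGD on squared error for a linear model; masked weights are held at zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [(wi - lr * err * xi) * mi for wi, xi, mi in zip(w, x, mask)]
    return w

def iterative_prune(weights, data, schedule=(0.25, 0.5)):
    """Alternate pruning and retraining, raising sparsity at each round."""
    mask = [1.0] * len(weights)
    for sparsity in schedule:
        mask = magnitude_prune(weights, sparsity)
        weights = retrain(weights, mask, data)
    return weights, mask

# Toy data (assumption): targets depend only on features 0 and 2.
rng = random.Random(0)
true_w = [2.0, 0.0, -1.5, 0.0]
data = []
for _ in range(50):
    x = [rng.uniform(-1.0, 1.0) for _ in true_w]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

# Train a dense model first, then prune and retrain iteratively.
dense_w = retrain([0.1, 0.1, 0.1, 0.1], [1.0] * 4, data)
final_weights, final_mask = iterative_prune(dense_w, data)
print(final_mask, final_weights)
```

Raising the sparsity in stages rather than pruning to the target in one shot gives the surviving weights a chance to compensate after each round, which is the point of the "iterative retraining" part.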