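Using PyTorch's built-in quantization methods.
Dynamic quantization cannot quantize convolution layers. The call below does not raise an error; the Conv2d layers are simply left unquantized: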
quantized_model = torch.quantization.quantize_dynamic(model, {nn.Conv2d}, dtype=torch.qint8)
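One way to confirm the silent no-op is to compare module types before and after the call. The sketch below is only an illustration: `TinyNet` is a hypothetical toy model (not the model from these notes), and it assumes a PyTorch build that still ships the eager-mode `torch.quantization` API with a quantized CPU engine available.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy model with one conv layer and one linear layer."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        x = self.conv(x).mean(dim=(2, 3))  # global average pool to (N, 8)
        return self.fc(x)


model = TinyNet().eval()

# Request dynamic quantization for both layer types.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Conv2d, nn.Linear}, dtype=torch.qint8
)

# The conv layer is still a plain float nn.Conv2d; only the linear layer was swapped.
conv_unchanged = type(quantized.conv) is nn.Conv2d
fc_quantized = type(quantized.fc) is not nn.Linear

print("conv left as float:", conv_unchanged)
print("fc replaced by:", type(quantized.fc).__name__)
```

Checking the types this way is a cheap guard, since nothing else signals that a requested layer type was skipped.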
Dynamic quantization only handles the following layer types (excerpt from the quantize_dynamic source):
if dtype == torch.qint8:
    qconfig_spec = {
        nn.Linear : default_dynamic_qconfig,
        nn.LSTM : default_dynamic_qconfig,
        nn.GRU : default_dynamic_qconfig,
        nn.LSTMCell : default_dynamic_qconfig,
        nn.RNNCell : default_dynamic_qconfig,
        nn.GRUCell : default_dynamic_qconfig,
    }
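Rather than reading the source, the supported set can also be queried at runtime. The snippet below is a small sketch that assumes `get_default_dynamic_quant_module_mappings` is exposed under `torch.quantization` in the installed version; the `candidates` list is just an illustrative selection.

```python
import torch
import torch.nn as nn

# Mapping from float module types to their dynamically quantized replacements.
mapping = torch.quantization.get_default_dynamic_quant_module_mappings()

# Illustrative selection of layer types to check.
candidates = [nn.Linear, nn.LSTM, nn.GRU, nn.Conv2d]
supported = {cls.__name__: (cls in mapping) for cls in candidates}

for name, ok in supported.items():
    print(f"{name}: {'swapped by quantize_dynamic' if ok else 'left as float'}")
```

A layer type that is missing from this mapping is skipped during conversion, which matches the behavior seen with Conv2d.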
With static quantization, the flow gets as far as the model conversion step:
model_int8 = torch.quantization.convert(model_fused_prepared)
However, the depthwise separable convolution block is defined as follows:
def conv_dw(inp, oup, stride):
    return nn.Sequential(
        nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=True),
        nn.ReLU(inplace=True),
        nn.Conv2d(inp, oup, 1, 1, 0, bias=True),
        nn.ReLU(inplace=True),
    )
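The conversion raises the error below. The message comes from the quantized cuDNN conv2d kernel, which is limited to groups = 1, while the depthwise convolution layer here has groups = 3, so this approach fails as well: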
RuntimeError: Quantized cudnn conv2d is currenty limited to groups = 1; received groups =3
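Since the message names the cuDNN kernel specifically, one thing worth trying is running the same static flow on the CPU backends. The following is a minimal sketch under several assumptions, not a verified fix for the original model: the `Wrapped` module, the input sizes, and the random calibration data are all hypothetical, and it assumes an eager-mode `torch.quantization` API with either the fbgemm or qnnpack engine available.

```python
import torch
import torch.nn as nn


def conv_dw(inp, oup, stride):
    return nn.Sequential(
        nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=True),
        nn.ReLU(inplace=True),
        nn.Conv2d(inp, oup, 1, 1, 0, bias=True),
        nn.ReLU(inplace=True),
    )


class Wrapped(nn.Module):
    """Hypothetical wrapper that adds quant/dequant stubs around one block."""

    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.block = conv_dw(3, 8, 1)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.block(self.quant(x)))


# Pick a CPU quantized engine (assumption: at least one of these is built in).
engines = torch.backends.quantized.supported_engines
engine = "fbgemm" if "fbgemm" in engines else "qnnpack"
torch.backends.quantized.engine = engine

model = Wrapped().eval()
model.qconfig = torch.quantization.get_default_qconfig(engine)

# Fuse each Conv2d with the ReLU that follows it.
model_fused = torch.quantization.fuse_modules(
    model, [["block.0", "block.1"], ["block.2", "block.3"]]
)
model_fused_prepared = torch.quantization.prepare(model_fused)

# Calibrate with random data (stand-in for real samples).
with torch.no_grad():
    model_fused_prepared(torch.randn(4, 3, 16, 16))

model_int8 = torch.quantization.convert(model_fused_prepared)

with torch.no_grad():
    out = model_int8(torch.randn(1, 3, 16, 16))
print(out.shape)
```

The model and its inputs stay on the CPU throughout, so the cuDNN kernel from the error message is never involved. Whether the accuracy and speed are acceptable for the real network would still need to be measured.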