TensorFlow model compression: after stepping on countless pits, it finally works

1. Install Bazel: download the Linux .sh installer from the Bazel releases on GitHub and run it (an example invocation is shown right after these steps)
2. Download the latest TensorFlow source code from GitHub
3. Enter the TensorFlow source folder and run:
bazel build tensorflow/tools/graph_transforms:transform_graph
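
For step 1, the downloaded .sh file is a self-contained installer, so running it is all there is to it. A minimal sketch, where the file name and version are placeholders for whichever release you actually downloaded:

chmod +x bazel-<version>-installer-linux-x86_64.sh
./bazel-<version>-installer-linux-x86_64.sh --user   # --user installs bazel into ~/bin
export PATH="$PATH:$HOME/bin"
bazel version   # check that bazel is found before building TensorFlow
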
Building transform_graph is where you hit all kinds of pitfalls, for example:
ERROR: /opt/tf/tensorflow-master/tensorflow/core/kernels/BUILD:3044:1: C++ compilation of rule '//tensorflow/core/kernels:matrix_square_root_op' failed (Exit 4)
gcc: internal compiler error: Killed (program cc1plus)
This error means gcc was killed because the machine ran out of memory during compilation; the fix is to add a swap file before building:

# create the swap image file
sudo dd if=/dev/zero of=/mnt/512Mb.swap bs=1M count=512
# format the image file as swap
sudo mkswap /mnt/512Mb.swap
# enable the swap file
sudo swapon /mnt/512Mb.swap
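
It is worth checking that the swap is actually active, and removing it again once the build is finished; a small sketch, nothing here is TensorFlow-specific:

free -h                        # the Swap line should now show the extra 512M
sudo swapoff /mnt/512Mb.swap   # after the build, release the swap file
sudo rm /mnt/512Mb.swap

If gcc still gets killed, a larger swap file (for example count=2048 for 2 GB) or building with fewer parallel jobs (bazel build --jobs 2 ...) usually helps.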

Another pitfall is a download error for the @aws dependency.
Some bloggers on CSDN suggest deleting the half-downloaded file from the temporary folder and downloading it again, but that did nothing for me. What did work was running one command before invoking bazel:

sed -i '\@https://github.com/aws/aws-sdk-cpp/archive/1.5.8.tar.gz@aws' tensorflow/workspace.bzl

The URL in the command is the address of the file that actually needs to be downloaded; some of the download addresses recorded in tensorflow/workspace.bzl have changed, so the entry has to point at one that still works.
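
If the sed one-liner does not match your checkout (workspace.bzl changes between TensorFlow versions), you can also just locate the aws-sdk-cpp entry and fix the URL by hand; a sketch:

grep -n "aws-sdk-cpp" tensorflow/workspace.bzl
# open tensorflow/workspace.bzl at the lines grep reports and make sure the
# archive's urls point at an address that still resolves, e.g. the github.com URL above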

With that, the bazel build finally completes.

4. Once the build is done, compressing the model is again a single command: in_graph is the path of the input model, outputs is the name of the model's output node, out_graph is the path for the compressed model, and transforms is set to quantize_weights, which converts the 32-bit float weights to 8 bits and is by far the most effective step of this method. Some bloggers first build summarize_graph to print the input and output nodes and then pass a long list of extra transforms to strip nodes, but that gave no further reduction in model size for me, so I stopped here.
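
If you do not know your model's output node name for --outputs, the same graph_transforms directory also contains a summarize_graph tool that prints a frozen graph's inputs and outputs; a sketch, reusing the ctpn.pb path from the command below:

bazel build tensorflow/tools/graph_transforms:summarize_graph
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=../model/ctpn.pb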

bazel-bin/tensorflow/tools/graph_transforms/transform_graph --in_graph=../model/ctpn.pb --outputs='output_node_name' --out_graph=../model/quantized_ctpn.pb --transforms='quantize_weights'

In the end the model shrank from 68 MB to 17 MB, a reduction of about 75%, and in my tests the inference results were basically unchanged, so this method is genuinely useful.
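
Before swapping the compressed model into your own code, a quick sanity check that the new .pb still parses is cheap; a sketch, assuming a TF 1.x Python environment and the paths used above:

ls -lh ../model/ctpn.pb ../model/quantized_ctpn.pb   # compare file sizes
python -c "import tensorflow as tf; g = tf.GraphDef(); g.ParseFromString(open('../model/quantized_ctpn.pb', 'rb').read()); print(len(g.node), 'nodes parsed')"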

Origin blog.csdn.net/u013837919/article/details/86770669