First Steps with Deep Learning

I. Preparing the Machine

I picked up an ASUS FX503 ("Flying Fortress"). Basic specs: CPU: 7th-gen standard-voltage Core i5; GPU: GTX 1060 6 GB; RAM: 16 GB; SSD: 128 GB.
First I installed Ubuntu 16.04.1, allocating 40 GB to /home.

1. Won't wake after suspend: laptop_mode

After installation I found the laptop wouldn't wake once the lid had been closed. Online resources suggested enabling laptop_mode.

Open the laptop-mode-tools configuration GUI:

sudo lmt-config-gui

wr@wr-FX503VM:~$ dpkg -l | grep laptop-mode-tools
ii  laptop-mode-tools 1.68-3ubuntu1 all Tools for Power Savings based on battery/AC status

In /etc/default/acpi-support we see:
# Note: to enable "laptop mode" (to spin down your hard drive for longer
# periods of time), install the laptop-mode-tools package and configure
# it in /etc/laptop-mode/laptop-mode.conf.

This tells us to do the configuration in /etc/laptop-mode/laptop-mode.conf.
In that file, look for ENABLE_LAPTOP_MODE_ON_BATTERY, ENABLE_LAPTOP_MODE_ON_AC, and ENABLE_LAPTOP_MODE_WHEN_LID_CLOSED.
The comments make their meaning clear: whether to enable LAPTOP_MODE on battery, on AC power, and when the lid is closed.
Set all three to 1.

sudo laptop_mode start  

ok

check:
wr@wr-FX503VM:~$ cat /proc/sys/vm/laptop_mode 
2
sudo gedit /etc/systemd/logind.conf 
#HandleLidSwitch=suspend -> HandleLidSwitch=ignore
#HandleLidSwitchDocked=ignore -> HandleLidSwitchDocked=ignore

sudo systemctl restart systemd-logind

Things still weren't fully working; on further analysis it looked like the NVIDIA driver needed to be installed.

2. Installing the NVIDIA driver

At first the guides I found online led me down a dead end:

sudo gedit /etc/modprobe.d/blacklist.conf
and appended the following to /etc/modprobe.d/blacklist.conf to blacklist nouveau:

{
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist rivatv
blacklist nvidiafb
}

sudo update-initramfs -u

After that the machine wouldn't boot. Lesson learned the hard way: don't do this!

I booted into a command line from the installation disc and reverted the change.

Then:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

ubuntu-drivers devices:
{
wr@wr-FX503VM:~$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001C20sv00001043sd0000154Ebc03sc00i00
vendor   : NVIDIA Corporation
driver   : nvidia-396 - third-party free recommended
driver   : nvidia-390 - third-party free
driver   : nvidia-384 - third-party free
driver   : xserver-xorg-video-nouveau - distro free builtin
}

Pick the appropriate driver.

Press Ctrl+Alt+F1 to switch to a text console (tty),
then stop the LightDM display manager:

sudo service lightdm stop
sudo apt-get install nvidia-396
sudo reboot
sudo nvidia-smi
sudo nvidia-settings

At this point I made a point of checking:
lsmod | grep nouveau
and nouveau was gone.

3. Installing CUDA 9.1

Find the build that matches your machine on the official site: cuda_9.1.85_387.26_linux

  • Install CUDA
sudo ./cuda_9.1.85_387.26_linux.run

Note: the installer asks a series of confirmation questions. At the second prompt, which asks whether to install the bundled driver (Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?), be sure to answer no: we already installed a newer NVIDIA driver, so it must not be installed here.
Accept the defaults or answer yes everywhere else. The install then finishes much faster than you might expect.

  • Configure ~/.bashrc
sudo gedit ~/.bashrc
export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}} 
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} 
export CUDA_HOME=/usr/local/cuda
  • Configure /etc/profile
sudo gedit /etc/profile
export PATH=/usr/local/cuda/bin:$PATH

After saving, create the linker configuration file:

sudo gedit /etc/ld.so.conf.d/cuda.conf

and add the following line to it:

    /usr/local/cuda/lib64

Then run:

    sudo ldconfig

Reboot, and everything is OK.
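As an aside, the `${PATH:+:${PATH}}` idiom in the exports above appends a colon plus the old value only when the variable is already non-empty, so an unset variable doesn't leave a dangling `:` at the end. A minimal Python sketch of the same logic (the function name is mine, purely for illustration):

```python
def append_path(new_dir, existing):
    """Emulate the shell idiom new_dir${VAR:+:${VAR}}: append ':' plus the
    existing value only when it is non-empty, so an unset variable does not
    leave a trailing ':' (which the dynamic linker would treat as an extra
    empty search-path entry)."""
    return new_dir + (":" + existing if existing else "")

print(append_path("/usr/local/cuda-9.1/bin", "/usr/bin:/bin"))
# -> /usr/local/cuda-9.1/bin:/usr/bin:/bin
print(append_path("/usr/local/cuda-9.1/lib64", ""))
# -> /usr/local/cuda-9.1/lib64
```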

Build and run a sample to verify:

wr@wr-FX503VM:/usr/local/cuda-9.1/samples/1_Utilities/deviceQuery$ sudo make
[sudo] password for wr: 
"/usr/local/cuda-9.1"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o deviceQuery.o -c deviceQuery.cpp
"/usr/local/cuda-9.1"/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o deviceQuery deviceQuery.o 
mkdir -p ../../bin/x86_64/linux/release
cp deviceQuery ../../bin/x86_64/linux/release
wr@wr-FX503VM:/usr/local/cuda-9.1/samples/1_Utilities/deviceQuery$ 
wr@wr-FX503VM:/usr/local/cuda-9.1/samples/1_Utilities/deviceQuery$ ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1060"
  CUDA Driver Version / Runtime Version          9.2 / 9.1
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 6070 MBytes (6365118464 bytes)
  (10) Multiprocessors, (128) CUDA Cores/MP:     1280 CUDA Cores
  GPU Max Clock rate:                            1671 MHz (1.67 GHz)
  Memory Clock rate:                             4004 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS

4. Installing cuDNN

Download the matching cuDNN build from the official site.

After downloading, extract the archive, cd into the extracted include directory, and run:

    sudo cp cudnn.h /usr/local/cuda/include/ # copy the header

Then cd into the extracted lib64 directory, and copy and re-link the shared libraries:

    sudo cp lib* /usr/local/cuda/lib64/ # copy the shared libraries
    cd /usr/local/cuda/lib64/
    sudo rm -rf libcudnn.so libcudnn.so.7 # remove the existing links
    sudo ln -s libcudnn.so.7.1.3 libcudnn.so.7 # create the symlink
    sudo ln -s libcudnn.so.7 libcudnn.so # create the symlink
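The link chain above (libcudnn.so → libcudnn.so.7 → libcudnn.so.7.1.3) can be illustrated in a throwaway directory; this is just a sketch of what the `ln -s` commands produce, not part of the install:

```python
import os
import tempfile

# Recreate the cuDNN symlink chain in a temp directory and check that the
# unversioned name resolves all the way down to the real versioned file.
tmp = tempfile.mkdtemp()
open(os.path.join(tmp, "libcudnn.so.7.1.3"), "wb").close()  # stand-in for the real library
os.symlink("libcudnn.so.7.1.3", os.path.join(tmp, "libcudnn.so.7"))
os.symlink("libcudnn.so.7", os.path.join(tmp, "libcudnn.so"))

resolved = os.path.realpath(os.path.join(tmp, "libcudnn.so"))
print(os.path.basename(resolved))  # -> libcudnn.so.7.1.3
```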

II. Preparing the Android Environment

1. Install gradle-4.1

Details omitted.

2. Install Android Studio

Details omitted.

3. Install the SDK

Details omitted.

4. Install the NDK

Details omitted.

5. Install adb

sudo apt-get install android-tools-adb

About adb:
https://blog.csdn.net/u012351661/article/details/78201040

III. Setting Up the TensorFlow Environment

1. Install Anaconda2-5.1.0-Linux-x86_64, and run TensorFlow inside Anaconda

Python 2.7; details omitted.

2. Install bazel

Omitted.

3. Configuring TensorFlow

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

configure:
{
wr@wr-FX503VM:~/Tensorflow_Workspace/tensorflow$ ./configure 
You have bazel 0.12.0 installed.
Please specify the location of python. [Default is /home/wr/anaconda2/bin/python]: 


Found possible Python library paths:
  /home/wr/anaconda2/lib/python2.7/site-packages
Please input the desired Python library path to use.  Default is [/home/wr/anaconda2/lib/python2.7/site-packages]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.1


Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 


Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1


Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:


Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 


Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]


Do you want to use clang as CUDA compiler? [y/N]: 
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 


Do you wish to build TensorFlow with MPI support? [y/N]: 
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: y
Searching for NDK and SDK installations.

Please specify the home path of the Android NDK to use. [Default is /home/wr/Android/Sdk/ndk-bundle]: /home/wr/Android_Workspace/SDK/ndk-bundle


The path /home/wr/Android_Workspace/SDK/ndk-bundle or its child file "source.properties" does not exist.
Please specify the home path of the Android NDK to use. [Default is /home/wr/Android/Sdk/ndk-bundle]: /home/wr/Android_Workspace/SDK/ndk-bundle


Writing android_ndk_workspace rule.
Please specify the home path of the Android SDK to use. [Default is /home/wr/Android/Sdk]: /home/wr/Android_Workspace/SDK


Please specify the Android SDK API level to use. [Available levels: ['25', '26', '27']] [Default is 27]: 25


Please specify an Android build tools version to use. [Available versions: ['25.0.3', '26.0.2', '27.0.3']] [Default is 27.0.3]: 25.0.3


Writing android_sdk_workspace rule.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
Configuration finished
wr@wr-FX503VM:~/Tensorflow_Workspace/tensorflow$ 

}

Then build it with bazel.

4. Building and Installing TensorFlow

After the build finishes, install the package it produced.

(https://blog.csdn.net/briliantly/article/details/79566013)
(pip install -U mock)
Install via pip:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg  
20180427日 星期五 13:38:37 CST : === Output wheel file is in: /tmp/tensorflow_pkg


wr@wr-FX503VM:~/Tensorflow_Workspace/tensorflow$ cd ///tmp/tensorflow_pkg
wr@wr-FX503VM:/tmp/tensorflow_pkg$ ls
tensorflow-1.8.0rc1-cp27-cp27mu-linux_x86_64.whl

Note in particular: do not use "sudo pip install", or it will install into /usr/local/lib/python2.7 rather than into Anaconda.

pip install /tmp/tensorflow_pkg/tensorflow-1.8.0rc1-cp27-cp27mu-linux_x86_64.whl

wr@wr-FX503VM:~$ pip install /tmp/tensorflow_pkg/tensorflow-1.8.0rc1-cp27-cp27mu-linux_x86_64.whl
Processing /tmp/tensorflow_pkg/tensorflow-1.8.0rc1-cp27-cp27mu-linux_x86_64.whl
Collecting protobuf>=3.4.0 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/9d/61/54c3a9cfde6ffe0ca6a1786ddb8874263f4ca32e7693ad383bd8cf935015/protobuf-3.5.2.post1-cp27-cp27mu-manylinux1_x86_64.whl (6.4MB)
    100% |████████████████████████████████| 6.4MB 2.2MB/s 
Collecting astor>=0.6.0 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/b2/91/cc9805f1ff7b49f620136b3a7ca26f6a1be2ed424606804b0fbcf499f712/astor-0.6.2-py2.py3-none-any.whl
Collecting backports.weakref>=1.0rc1 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/88/ec/f598b633c3d5ffe267aaada57d961c94fdfa183c5c3ebda2b6d151943db6/backports.weakref-1.0.post1-py2.py3-none-any.whl
Requirement already satisfied: wheel in ./anaconda2/lib/python2.7/site-packages (from tensorflow==1.8.0rc1) (0.30.0)
Requirement already satisfied: mock>=2.0.0 in ./anaconda2/lib/python2.7/site-packages (from tensorflow==1.8.0rc1) (2.0.0)
Requirement already satisfied: enum34>=1.1.6 in ./anaconda2/lib/python2.7/site-packages (from tensorflow==1.8.0rc1) (1.1.6)
Collecting gast>=0.2.0 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/5c/78/ff794fcae2ce8aa6323e789d1f8b3b7765f601e7702726f430e814822b96/gast-0.2.0.tar.gz
Collecting termcolor>=1.1.0 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Collecting absl-py>=0.1.6 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/90/6b/ba04a9fe6aefa56adafa6b9e0557b959e423c49950527139cb8651b0480b/absl-py-0.2.0.tar.gz (82kB)
    100% |████████████████████████████████| 92kB 6.4MB/s 
Collecting tensorboard<1.8.0,>=1.7.0 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/6e/5b/18f50b69b8af42f93c47cd8bf53337347bc1974480a10de51fdd7f8fd48b/tensorboard-1.7.0-py2-none-any.whl (3.1MB)
    100% |████████████████████████████████| 3.1MB 3.5MB/s 
Requirement already satisfied: six>=1.10.0 in ./anaconda2/lib/python2.7/site-packages (from tensorflow==1.8.0rc1) (1.11.0)
Collecting grpcio>=1.8.6 (from tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/0d/54/b647a6323be6526be27b2c90bb042769f1a7a6e59bd1a5f2eeb795bfece4/grpcio-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl (8.7MB)
    100% |████████████████████████████████| 8.7MB 3.1MB/s 
Requirement already satisfied: numpy>=1.13.3 in ./anaconda2/lib/python2.7/site-packages (from tensorflow==1.8.0rc1) (1.14.0)
Requirement already satisfied: setuptools in ./anaconda2/lib/python2.7/site-packages (from protobuf>=3.4.0->tensorflow==1.8.0rc1) (38.4.0)
Requirement already satisfied: funcsigs>=1; python_version < "3.3" in ./anaconda2/lib/python2.7/site-packages (from mock>=2.0.0->tensorflow==1.8.0rc1) (1.0.2)
Requirement already satisfied: pbr>=0.11 in ./anaconda2/lib/python2.7/site-packages (from mock>=2.0.0->tensorflow==1.8.0rc1) (4.0.2)
Collecting bleach==1.5.0 (from tensorboard<1.8.0,>=1.7.0->tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/33/70/86c5fec937ea4964184d4d6c4f0b9551564f821e1c3575907639036d9b90/bleach-1.5.0-py2.py3-none-any.whl
Requirement already satisfied: futures>=3.1.1; python_version < "3" in ./anaconda2/lib/python2.7/site-packages (from tensorboard<1.8.0,>=1.7.0->tensorflow==1.8.0rc1) (3.2.0)
Collecting markdown>=2.6.8 (from tensorboard<1.8.0,>=1.7.0->tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/6d/7d/488b90f470b96531a3f5788cf12a93332f543dbab13c423a5e7ce96a0493/Markdown-2.6.11-py2.py3-none-any.whl (78kB)
    100% |████████████████████████████████| 81kB 5.3MB/s 
Requirement already satisfied: werkzeug>=0.11.10 in ./anaconda2/lib/python2.7/site-packages (from tensorboard<1.8.0,>=1.7.0->tensorflow==1.8.0rc1) (0.14.1)
Collecting html5lib==0.9999999 (from tensorboard<1.8.0,>=1.7.0->tensorflow==1.8.0rc1)
  Downloading https://files.pythonhosted.org/packages/ae/ae/bcb60402c60932b32dfaf19bb53870b29eda2cd17551ba5639219fb5ebf9/html5lib-0.9999999.tar.gz (889kB)
    100% |████████████████████████████████| 890kB 3.7MB/s 
Building wheels for collected packages: gast, termcolor, absl-py, html5lib
  Running setup.py bdist_wheel for gast ... done
  Stored in directory: /home/wr/.cache/pip/wheels/9a/1f/0e/3cde98113222b853e98fc0a8e9924480a3e25f1b4008cedb4f
  Running setup.py bdist_wheel for termcolor ... done
  Stored in directory: /home/wr/.cache/pip/wheels/7c/06/54/bc84598ba1daf8f970247f550b175aaaee85f68b4b0c5ab2c6
  Running setup.py bdist_wheel for absl-py ... done
  Stored in directory: /home/wr/.cache/pip/wheels/23/35/1d/48c0a173ca38690dd8dfccfa47ffc750db48f8989ed898455c
  Running setup.py bdist_wheel for html5lib ... done
  Stored in directory: /home/wr/.cache/pip/wheels/50/ae/f9/d2b189788efcf61d1ee0e36045476735c838898eef1cad6e29
Successfully built gast termcolor absl-py html5lib
grin 1.2.1 requires argparse>=1.1, which is not installed.
Installing collected packages: protobuf, astor, backports.weakref, gast, termcolor, absl-py, html5lib, bleach, markdown, tensorboard, grpcio, tensorflow
  Found existing installation: html5lib 1.0.1
    Uninstalling html5lib-1.0.1:
      Successfully uninstalled html5lib-1.0.1
  Found existing installation: bleach 2.1.2
    Uninstalling bleach-2.1.2:
      Successfully uninstalled bleach-2.1.2
Successfully installed absl-py-0.2.0 astor-0.6.2 backports.weakref-1.0.post1 bleach-1.5.0 gast-0.2.0 grpcio-1.11.0 html5lib-0.9999999 markdown-2.6.11 protobuf-3.5.2.post1 tensorboard-1.7.0 tensorflow-1.8.0rc1 termcolor-1.1.0

5. Testing

wr@wr-FX503VM:~$ python
Python 2.7.14 |Anaconda, Inc.| (default, Dec  7 2017, 17:05:42) 
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
/home/wr/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
>>> print(tf.__version__)
1.8.0-rc1
>>> 

IV. The TensorFlow Android Demo

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android

bazel build -c opt //tensorflow/examples/android:tensorflow_demo
{
wr@wr-FX503VM:~/Tensorflow_Workspace/tensorflow$ bazel build -c opt //tensorflow/examples/android:tensorflow_demo
ERROR: /home/wr/Tensorflow_Workspace/tensorflow/WORKSPACE:105:1: no such package '@androidsdk//': Bazel requires Android build tools version 26.0.1 or newer, 25.0.3 was provided and referenced by '//external:android/dx_jar_import'
ERROR: Analysis of target '//tensorflow/examples/android:tensorflow_demo' failed; build aborted: Loading failed
INFO: Elapsed time: 2.783s
FAILED: Build did NOT complete successfully (13 packages loaded)
}

So I moved up to:
SDK build tools: 26.0.2
NDK: r14b

external/eigen_archive/unsupported/Eigen/CXX11/Tensor:84:10: fatal error: 'cuda_runtime.h' file not found

This failed, so TensorFlow needs to be reconfigured for the Android build.

1. Configuring TensorFlow for Android

Re-run the TensorFlow configure step:

wr@wr-FX503VM:~/Tensorflow_Workspace/tensorflow$ ./configure 
You have bazel 0.12.0 installed.
Please specify the location of python. [Default is /home/wr/anaconda2/bin/python]: 


Found possible Python library paths:
  /home/wr/anaconda2/lib/python2.7/site-packages
Please input the desired Python library path to use.  Default is [/home/wr/anaconda2/lib/python2.7/site-packages]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: n
No jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: n
No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: n
Clang will not be downloaded.

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 


The WORKSPACE file has at least one of ["android_sdk_repository", "android_ndk_repository"] already set. Will not ask to help configure the WORKSPACE. Please delete the existing rules to activate the helper.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
Configuration finished

Then run again:

bazel build -c opt //tensorflow/examples/android:tensorflow_demo
{
Target //tensorflow/examples/android:tensorflow_demo up-to-date:
  bazel-bin/tensorflow/examples/android/tensorflow_demo_deploy.jar
  bazel-bin/tensorflow/examples/android/tensorflow_demo_unsigned.apk
  bazel-bin/tensorflow/examples/android/tensorflow_demo.apk
INFO: Elapsed time: 706.365s, Critical Path: 73.09s
INFO: Build completed successfully, 970 total actions

}

ok

2. Building with bazel via Android Studio

Open Android Studio and open the project at
/home/wr/Tensorflow_Workspace/tensorflow/tensorflow/examples/android

Open the build.gradle file and change bazelLocation to the right path:

//def bazelLocation = '/usr/local/bin/bazel'
def bazelLocation = '/home/wr/.bazel/bin/bazel'

Build the project.

Ok

3. Run

Omitted.

V. Transfer Learning Based on Inception V3

1. Install PyCharm
Omitted.
2. TensorBoard
Installation omitted.

Use TensorBoard to inspect TensorFlow models:

python /home/wr/Tensorflow_Workspace/tensorflow/tensorflow/python/tools/import_pb_to_tensorboard.py --model_dir=tensorflow_inception_graph_flower.pb --log_dir=tensorboard_graph
tensorboard --logdir=tensorboard_graph

tensorboard --logdir=tensorboard_graph --debug

The graph never showed up; it kept failing with TypeError: GetNext()

find tensorboard_graph | grep tfevents

tensorboard --inspect --logdir tensorboard_graph/

Searching online later, this turned out to be a TensorBoard bug:

{
Problem: TypeError: GetNext() takes exactly 1 argument (2 given)

https://github.com/tensorflow/tensorboard/pull/1086/files/e303ebd339050756f451f033b15d75470d57e02a#diff-59cb290472c659c40df2436665c48aae

Apply the change from the link above to
/home/wr/anaconda2/lib/python2.7/site-packages/tensorboard/backend/event_processing/event_file_loader.py
ok
}

python /home/wr/Tensorflow_Workspace/tensorflow/tensorflow/python/tools/import_pb_to_tensorboard.py --model_dir=tensorflow_inception_graph_flower.pb --log_dir=tensorboard_graph
tensorboard --logdir=tensorboard_graph
python /home/wr/Tensorflow_Workspace/tensorflow/tensorflow/python/tools/import_pb_to_tensorboard.py --model_dir=tensorflow_inception_graph_v3_flower_striped.pb --log_dir=tensorboard_graph
tensorboard --logdir=tensorboard_graph

Then open the address it prints in a browser
to see each model's structure.

3. Training a flower-recognition model on the PC, via transfer learning from Inception V3

Because each training image is used many times, the feature vector the Inception-v3 model computes for it can be saved to a file.

Name of the image input tensor:
JPEG_DATA_TENSOR_NAME = 'DecodeJpeg/contents:0'

Name of the tensor holding the bottleneck-layer output in the Inception-v3 model.
In Google's Inception-v3 model this tensor is named 'pool_3/_reshape:0'.
(When training a model, you can read a tensor's name from tensor.name.)
BOTTLENECK_TENSOR_NAME = 'pool_3/_reshape:0'

The flower images are run through the Inception-v3 model to obtain feature vectors, and then a single fully connected layer is added on top as a simple linear classifier.
Because the trained Inception-v3 model has already abstracted the raw images into feature vectors that are much easier to classify, there is no need to train a complex neural network from scratch for this new classification task.
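The caching idea can be sketched like this (a minimal sketch: `compute_bottleneck` is a hypothetical callable standing in for the sess.run that pushes one image through Inception-v3):

```python
import os
import pickle

def get_bottleneck(image_path, cache_dir, compute_bottleneck):
    """Return the bottleneck feature vector for image_path, computing it at
    most once: later calls for the same image load the cached copy instead
    of repeating the (expensive) Inception-v3 forward pass."""
    cache_file = os.path.join(
        cache_dir, os.path.basename(image_path) + ".bottleneck")
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    vector = compute_bottleneck(image_path)  # e.g. sess.run(bottleneck_tensor, ...)
    with open(cache_file, "wb") as f:
        pickle.dump(vector, f)
    return vector
```

On the first epoch each vector is computed and written to disk; every later epoch just reloads it, which is what makes retraining only the final layer so fast.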

See transfer_flower.py for the code.

A lesson learned
Exporting the pb model:

constant_graph = graph_util.convert_variables_to_constants(
    sess, sess.graph_def, ["final_training_ops/Softmax"])
with gfile.FastGFile(os.path.join(MODEL_OUT_DIR, MODEL_FILE_OUT), mode='wb') as f:
    f.write(constant_graph.SerializeToString())

The convert_variables_to_constants step is essential: without it the variables are not saved into the model, and the variables are exactly what all that training effort produced.
If you load a model without variables on Android, it will keep demanding that you initialize them, and since the variables are the heart of the model, initializing them fresh on the device is pointless.

4. The transfer architecture

We combine the two models (v3 and flower). For V3 we take:
v3: import "Mul", output "pool_3/_reshape"

Then the v3 bottleneck output is fed into the flower model's import:
feed "pool_3/_reshape" -> "BottleneckInputPlaceholder"

And for flower we have:
flower: import "BottleneckInputPlaceholder", output "final_training_ops/Softmax"
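Conceptually this combination is just function composition; a toy Python sketch (both `run_v3` and `run_flower` are hypothetical stand-ins for session runs on the two stripped graphs):

```python
def classify(image, run_v3, run_flower):
    """Two-stage inference: the V3 sub-graph turns the image into a
    bottleneck vector, which then becomes the Flower sub-graph's input."""
    bottleneck = run_v3(image)     # feed "Mul", fetch "pool_3/_reshape"
    return run_flower(bottleneck)  # feed "BottleneckInputPlaceholder",
                                   # fetch "final_training_ops/Softmax"
```

With real graphs, each stand-in would be a sess.run on the corresponding stripped .pb file.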

5、strip_unused

bazel build tensorflow/python/tools:strip_unused && \
bazel-bin/tensorflow/python/tools/strip_unused \
--input_graph=/home/wr/Tensorflow_Workspace/code/transfer_learning/model/tensorflow_inception_graph_v3.pb \
--output_graph=/home/wr/Tensorflow_Workspace/code/transfer_learning/model/tmp/tensorflow_inception_graph_v3_bootleneck_striped.pb \
--input_node_names="Mul"  \
--output_node_names="pool_3/_reshape"  \
--input_binary=true

bazel build tensorflow/python/tools:strip_unused && \
bazel-bin/tensorflow/python/tools/strip_unused \
--input_graph=/home/wr/Tensorflow_Workspace/code/transfer_learning/model/tmp/tensorflow_inception_graph_flower.pb \
--output_graph=/home/wr/Tensorflow_Workspace/code/transfer_learning/model/tmp/tensorflow_inception_graph_v3_flower_striped.pb \
--input_node_names="BottleneckInputPlaceholder"  \
--output_node_names="final_training_ops/Softmax"  \
--input_binary=true
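strip_unused keeps only the nodes needed to compute the named outputs from the named inputs; at its core that is a reachability walk over the graph. A toy sketch (the dict-based graph and node lists are mine, for illustration only):

```python
def reachable_nodes(deps, outputs):
    """Walk the dependency graph backwards from the output nodes and collect
    every node they transitively depend on; anything not collected is
    'unused' and can be stripped from the exported graph."""
    keep, stack = set(), list(outputs)
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(deps.get(node, []))
    return keep

# Toy graph: the Softmax depends on the bottleneck placeholder, while the
# JPEG-decoding node feeds nothing we asked for, so it gets stripped.
deps = {
    "final_training_ops/Softmax": ["BottleneckInputPlaceholder"],
    "BottleneckInputPlaceholder": [],
    "DecodeJpeg": [],
}
kept = reachable_nodes(deps, ["final_training_ops/Softmax"])
print("DecodeJpeg" in kept)  # -> False
```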

Check with TensorBoard:

/home/wr/Tensorflow_Workspace/code/transfer_learning/model/tmp

python /home/wr/Tensorflow_Workspace/tensorflow/tensorflow/python/tools/import_pb_to_tensorboard.py --model_dir=tensorflow_inception_graph_v3_bootleneck_striped.pb --log_dir=tensorboard_graph
tensorboard --logdir=tensorboard_graph

python /home/wr/Tensorflow_Workspace/tensorflow/tensorflow/python/tools/import_pb_to_tensorboard.py --model_dir=tensorflow_inception_graph_v3_flower_striped.pb --log_dir=tensorboard_graph
tensorboard --logdir=tensorboard_graph

That gives us the two stripped models; call them V3 and Flower for short.

6. Java

Load the two models separately:

private static final String V3_BOTTLENECK_INPUT_NAME = "Mul"; //Mul
private static final String V3_BOTTLENECK_OUTPUT_NAME = "pool_3/_reshape"; //
private static final String V3_BOTTLENECK_MODEL_FILE = "file:///android_asset/tensorflow_inception_graph_v3_bootleneck_striped.pb";

private static final String FLOWER_INPUT_NAME = "BottleneckInputPlaceholder"; //Mul
private static final String FLOWER_OUTPUT_NAME = "final_training_ops/Softmax"; //
private static final String FLOWER_MODEL_FILE = "file:///android_asset/tensorflow_inception_graph_v3_flower_striped.pb";
private static final String FLOWER_LABEL_FILE =
          "file:///android_asset/imagenet_comp_graph_label_strings_flower.txt";

First run the V3 model to get the bottleneck-layer output:
inferenceNeckInterface.feed(neckInputNeckName, floatValues, 1, inputSize, inputSize, 3);
inferenceNeckInterface.run(neckOnputNames, logStats);
inferenceNeckInterface.fetch(neckOutputNeckName, neck_outputs);

Then feed the bottleneck output into the Flower model:
inferenceInterface.feed(inputName, neck_outputs, 1, neckSize);
inferenceInterface.run(outputNames, logStats);
inferenceInterface.fetch(outputName, outputs);
and read out the result.

See InceptionV3Classifier.java and ClassifierActivity.java for the code.

7. Run

Omitted.

VI. Single Shot MultiBox Detector

Swap ssd_mobilenet_v1 for ssd_mobilenet_v2.
Reference:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

Download ssd_mobilenet_v2.pb and replace TF_OD_API_MODEL_FILE in DetectorActivity.java directly:

private static final int TF_OD_API_INPUT_SIZE = 300;
//  private static final String TF_OD_API_MODEL_FILE =
//      "file:///android_asset/ssd_mobilenet_v1_android_export.pb";
  private static final String TF_OD_API_MODEL_FILE =
          "file:///android_asset/ssd_mobilenet_v2.pb";
  private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/coco_labels_list_v2.txt";

Also make sure:

private static final DetectorMode MODE = DetectorMode.TF_OD_API;

Ok.


Reprinted from blog.csdn.net/miller1026/article/details/80652696