Caffe
是一个清晰而高效的深度学习框架,是纯粹的C++、CUDA架构,支持命令行,Python和MATLAB接口,可以在CPU和GPU直接无缝切换,Caffe的优势:
- 上手快,模型和相应优化都是以文本形式而非代码形式给出,
Caffe
给出了模型的定义,最优化设置以及预训练的权重,方便立即上手。 - 速度快,Caffe与cuDNN结合使用,能够运行最棒的模型和海量的数据。
- 模块化,方便拓展新的认知和设置.
- 开源,开放
本文主要用于记录在MacBookPro笔记本电脑中安装Caffe(CPU-Only)框架。并使用最简单的LeNet识别的Mnist手写数字训练集:http://caffe.berkeleyvision.org/gathered/examples/mnist.html
1.安装Homebrew
打开你的terminal~输入
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
2.安装CMake
brew install cmake
3.安装依赖
terminal输入
brew install git openblas python brew install --fresh -vd snappy leveldb gflags glog szip hdf5 lmdb homebrew/science/opencv brew install --fresh -vd --with-python protobuf brew install --fresh -vd boost boost-python
我这里安装的是python3版本:
brew install python3
4.安装Caffe
下载 Caffe 并修改配置
首先进入安装目录
cd /usr/local/Cellar下载并修改:
git clone https://github.com/BVLC/caffe.git cd caffe cp Makefile.config.example Makefile.config
其实我们用的CMake不用改Makefile.config可是心理作用还是改一下吧,找到Makefile.config(刚copy出来的)搜索CPU_ONLY := 1 ,取消注释
修改后的 Makefile.config
## Refer to http://caffe.berkeleyvision.org/installation.html # Contributions simplifying and improving our build system are welcome! # cuDNN acceleration switch (uncomment to build with cuDNN). # USE_CUDNN := 1 # CPU-only switch (uncomment to build without GPU support). CPU_ONLY := 1 # uncomment to disable IO dependencies and corresponding data layers # USE_OPENCV := 0 # USE_LEVELDB := 0 # USE_LMDB := 0 # uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary) # You should not set this flag if you will be reading LMDBs with any # possibility of simultaneous read and write # ALLOW_LMDB_NOLOCK := 1 # Uncomment if you're using OpenCV 3 OPENCV_VERSION := 3 # To customize your choice of compiler, uncomment and set the following. # N.B. the default for Linux is g++ and the default for OSX is clang++ # CUSTOM_CXX := g++ # CUDA directory contains bin/ and lib/ directories that we need. # CUDA_DIR := /usr/local/cuda # On Ubuntu 14.04, if cuda tools are installed via # "sudo apt-get install nvidia-cuda-toolkit" then use this instead: # CUDA_DIR := /usr # CUDA architecture setting: going with all of them. # For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility. # For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility. # For CUDA >= 9.0, comment the *_20 and *_21 lines for compatibility. # CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \ # -gencode arch=compute_35,code=sm_35 \ # -gencode arch=compute_50,code=sm_50 \ # -gencode arch=compute_52,code=sm_52 \ # -gencode arch=compute_60,code=sm_60 \ # -gencode arch=compute_61,code=sm_61 \ # -gencode arch=compute_61,code=compute_61 # BLAS choice: # atlas for ATLAS (default) # mkl for MKL # open for OpenBlas BLAS := atlas # Custom (MKL/ATLAS/OpenBLAS) include and lib directories. # Leave commented to accept the defaults for your choice of BLAS # (which should work)! # BLAS_INCLUDE := /path/to/your/blas # BLAS_LIB := /path/to/your/blas # Homebrew puts openblas in a directory that is not on the standard search path # BLAS_INCLUDE := $(shell brew --prefix openblas)/include # BLAS_LIB := $(shell brew --prefix openblas)/lib # This is required only if you will compile the matlab interface. # MATLAB directory should contain the mex binary in /bin. # MATLAB_DIR := /usr/local # MATLAB_DIR := /Applications/MATLAB_R2012b.app # NOTE: this is required only if you will compile the python interface. # We need to be able to find Python.h and numpy/arrayobject.h. # PYTHON_INCLUDE := /usr/include/python2.7 \ # /usr/lib/python2.7/dist-packages/numpy/core/include PYTHON_INCLUDE := /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/include \ /usr/local/Cellar/numpy/1.14.2/lib/python3.6/site-packages/numpy/core/include # Anaconda Python distribution is quite popular. Include path: # Verify anaconda location, sometimes it's in root. # ANACONDA_HOME := $(HOME)/anaconda # PYTHON_INCLUDE := $(ANACONDA_HOME)/include \ # $(ANACONDA_HOME)/include/python2.7 \ # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include # Uncomment to use Python 3 (default is Python 2) # PYTHON_LIBRARIES := boost_python3 python3.5m # PYTHON_INCLUDE := /usr/include/python3.5m \ # /usr/lib/python3.5/dist-packages/numpy/core/include # We need to be able to find libpythonX.X.so or .dylib. PYTHON_LIB := /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib # PYTHON_LIB := $(ANACONDA_HOME)/lib # Homebrew installs numpy in a non standard path (keg only) # PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include # PYTHON_LIB += $(shell brew --prefix numpy)/lib # Uncomment to support layers written in Python (will link against Python libs) WITH_PYTHON_LAYER := 1 # Whatever else you find you need goes here. INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/local/Cellar/opencv/3.4.1_2/include LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/local/Cellar/opencv/3.4.1_2/lib # If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies # INCLUDE_DIRS += $(shell brew --prefix)/include # LIBRARY_DIRS += $(shell brew --prefix)/lib # NCCL acceleration switch (uncomment to build with NCCL) # https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0) # USE_NCCL := 1 # Uncomment to use `pkg-config` to specify OpenCV library paths. # (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.) # USE_PKG_CONFIG := 1 # N.B. both build and distribute dirs are cleared on `make clean` BUILD_DIR := build DISTRIBUTE_DIR := distribute # Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171 # DEBUG := 1 # The ID of the GPU that 'make runtest' will use to run unit tests. TEST_GPUID := 0 # enable pretty build (comment to see full commands) Q ?= @
安装caffe的python接口
cd /usr/local/Cellar/caffe
$ for req in $(cat requirements.txt); do pip3 install $req ; done
设置python环境变量
vi ~/.bash_profile
# Setting PATH for Python 3.6 # The original version is saved in .bash_profile.pysave PATH="/Library/Frameworks/Python.framework/Versions/3.6/bin:${PATH}" # PATH="/usr/local/bin/python3${PATH}" export PATH
export PYTHONPATH=/usr/local/Cellar/caffe/python:$PYTHONPATH
然后,让修改立即生效
$ source ~/.bash_profile
5.安装
mkdir build cd build cmake .. 打开CMakeCache.txt,将 CPU_ONLY:BOOL= 赋值ON。 打开CaffeConfig.cmake, 找到set(CPU_ONLY, OFF),同样改成ON。 注意:不要在caffe目录下去执行make;会一直报错;
6.编译
make all make install make runtest
7.测试mnist
$ cd caffe $ ./data/mnist/get_mnist.sh #下载MNIST数据库并解压缩 $ ./examples/mnist/create_mnist.sh #将其转换成Lmdb数据库格式 $ vi examples/mnist/lenet_solver.prototxt # 设置solver_mode: CPU $ ./examples/mnist/train_lenet.sh # 训练网络</code>
在正式开始训练和测试我们的模型之前,先对LeNet有一个大致了解,如下图所示,它由一个卷积层、后面跟一个下采样层、再跟另外一个卷积层和另一个下采样层,再之后是两个全连接层组成。这里caffe中用的示例和original LeNet的区别是使用ReLU(Rectified Linear Unit)取代了sigmoid激活函数。
LeNet各层的属性在$CAFFE_ROOT/examples/mnist/lenet_train_test.prototxt中进行了定义。
vi ./examples/mnist/lenet_train_test.prototxt命令即可查看网络各层的定义。
name: "LeNet" //网络名称是LeNet layer { name: "mnist" //数据层名称是mnist type: "Data" //类型是数据 top: "data" //输出数据到两个Blob,data和label top: "label" include { phase: TRAIN } transform_param { scale: 0.00390625 //确保输出数据在[0,1)之间,所以乘以1/256 } data_param { source: "examples/mnist/mnist_train_lmdb" //从这里获得数据 batch_size: 64 //每批大小是64 backend: LMDB } } layer { name: "mnist" type: "Data" top: "data" top: "label" include { phase: TEST } transform_param { scale: 0.00390625 } data_param { source: "examples/mnist/mnist_test_lmdb" batch_size: 100 //每批大小是100 backend: LMDB } } 然后是第一个卷积层和下采样层: layer { name: "conv1" type: "Convolution" bottom: "data" //以下层传输过来的data Blob作为输入 top: "conv1" //这层数据输出到Blob conv1 param { lr_mult: 1 //lr为learning rate,学习率 } param { lr_mult: 2 //bias的学习率是weight的两倍 } convolution_param { num_output: 20 //输出有20个channel kernel_size: 5 //卷积核大小为5 stride: 1 //卷积步长为1 weight_filler { type: "xavier" //使用xavier algorithm,根据输入和输出神经元的数目,自动确定初始化权重的范围 } bias_filler { type: "constant" //将偏置初始化为常数,且为0 } } } layer { name: "pool1" type: "Pooling" //层的类型是Pooling bottom: "conv1" //输入是conv1 Blob top: "pool1" //输出是pool1 Blob pooling_param { pool: MAX //下采样方式是最大值采样 kernel_size: 2 //在2*2的区域内选择最大值 stride: 2 //步长为2,防止区域有重叠 } } 第二个卷积层和下采样层也都是类似的,就不再赘述了,下面是两个全连接层: layer { name: "ip1" type: "InnerProduct" //Fully Connection Layer在caffe中也叫Inner Product bottom: "pool2" //输入是pool2 Blob top: "ip1" //输出是ip1 Blob param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 //输出的神经元个数为500 weight_filler { type: "xavier" } bias_filler { type: "constant" } } } layer { name: "relu1" type: "ReLU" bottom: "ip1" //把输入和输出的Blob设为同一个名字,可以是对单个元素操作的relu节省存储空间 top: "ip1" } 然后是另一个全连接层,不过只有10个输出,对应10个数字。接下来就是Loss层(和Accuracy层,只在test阶段使用): layer { name: "loss" type: "SoftmaxWithLoss" bottom: "ip2" //将全连接层的prediction和data层输出的label作为输入 bottom: "label" top: "loss" } 这一层没有进一步的输出,只计算损失函数值,当BP开始时将loss 报告出来。这个网络的定义就到此结束了。 此外,还有一点需要注意的是,当如下的格式出现时, layer { // ...layer definition... include: { phase: TRAIN } }
说明这一层只在TRAIN阶段出现在网络中,当处在TEST阶段时,这一层不出现在网络中。没有这个标志的层始终出现在网络当中。所以在以上的定义中,DATA层以不同的BATCH出现了两次,分别是TRAIN和TEST阶段。另外在测试阶段还有一个Accuracy层,每100次迭代就计算一下准确率。
再输入命令行:~/caffe-master$ vi ./examples/mnist/lenet_solver.prototxt,可以看到MNIST solver的配置情况:
The train/test net protocol buffer definition net: "examples/mnist/lenet_train_test.prototxt" # test_iter specifies how many forward passes the test should carry out. # In the case of MNIST, we have test batch size 100 and 100 test iterations, # covering the full 10,000 testing images. test_iter: 100 # Carry out testing every 500 training iterations. test_interval: 500 # The base learning rate, momentum and the weight decay of the network. base_lr: 0.01 momentum: 0.9 weight_decay: 0.0005 # The learning rate policy lr_policy: "inv" gamma: 0.0001 power: 0.75 # Display every 100 iterations display: 100 # The maximum number of iterations max_iter: 10000 # snapshot intermediate results snapshot: 5000 snapshot_prefix: "examples/mnist/lenet" # solver mode: CPU or GPU solver_mode: GPU
$ vi examples/mnist/lenet_solver.prototxt # 设置solver_mode: CPU
这里可以看到网络训练的配置,每批次训练100张图片,共100批次10000张图片,基础的学习率是0.1,使用GPU计算。因为这里的训练量较小,所以GPU的速度优势还看不太出来,如果在大一些的网络和训练集中,GPU的速度优势会更加明显
最后输入训练:
./examples/mnist/train_lenet.sh
就正式开始了训练和测试,正常情况下像MNIST这个级别的数据量应该几分钟就可以训练完。截取最后几行:
运行结果保存在了lenet_iter_10000.solverstate文件中。