Ubuntu 14.04 + Caffe + CUDA 8.0 + cuDNN 8.0 v5.1 (GTX 1070): installation and testing

Copyright notice: this is an original article by the blogger and may not be reposted without the blogger's permission. https://blog.csdn.net/Ai_Smith/article/details/53000973

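After many painful days I finally got this working. Different machines may need different versions, but the steps are as follows. One complaint along the way: could NVIDIA please keep its newer software compatible with older versions? The mismatches cause a lot of errors.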

Part 1: Prepare the files (download them from the NVIDIA website) and put them in the Downloads directory:

Graphics driver: NVIDIA-Linux-x86_64-367.44.run

CUDA 8.0: cuda_8.0.27_linux.run

cuDNN: cudnn-8.0-linux-x64-v5.1-prod.tgz
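
Before starting, it can help to confirm that all three files are really in place. The following is a minimal sketch, not part of the original procedure; the function name check_downloads is hypothetical, and the file names are the ones listed above.

```shell
# Report which of the expected installer files are missing from a directory.
# Prints one missing file name per line; returns 0 only if all are present.
check_downloads() {
    dl_dir=$1
    missing=0
    for f in NVIDIA-Linux-x86_64-367.44.run \
             cuda_8.0.27_linux.run \
             cudnn-8.0-linux-x64-v5.1-prod.tgz; do
        if [ ! -f "$dl_dir/$f" ]; then
            echo "$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_downloads "$HOME/Downloads" || echo "some files are missing"
```

If you downloaded different versions, adjust the file names in the loop accordingly.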

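Part 2: Installation steps

2.1 System installation

Use Ubuntu 14.04: download the image, write it to a USB drive with UltraISO, and install from it; I won't go into the details. Disable automatic system updates.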

2.2 Install dependencies

Install the build tools:

$sudo apt-get update    # refresh the package lists first

$sudo apt-get install build-essential  # basic requirement

$sudo apt-get install cmake git vim

Install the dependencies:

$sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
$sudo apt-get install --no-install-recommends libboost-all-dev
$sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev
$sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

$sudo apt-get install python-numpy python-scipy python-matplotlib

2.3 Disable the nouveau driver

Press ALT+CTRL+F1 to switch to a text console:

$sudo service lightdm stop

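$sudo apt-get remove --purge 'nvidia*'
Create a blacklist file to disable the driver that ships with the system: $sudo vi /etc/modprobe.d/blacklist-nouveau.conf
Enter:
blacklist nouveau
options nouveau modeset=0
Save and quit (:wq)
Then run: $sudo update-initramfs -u   # rebuild the initramfs
Run $lsmod | grep nouveau and check for output. No output means nouveau has been disabled; if there is output, reboot and check again.
Reboot: $sudo reboot
After rebooting, do not log in to the desktop at the login screen; press ALT+CTRL+F1 to go straight to the console.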
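
The check for a loaded nouveau module can also be scripted. This is a small sketch under the assumption that the input is in lsmod format, with the module name in the first column; the function name nouveau_loaded is made up for illustration.

```shell
# Succeeds (exit status 0) if the module list read from stdin contains nouveau.
# Expects lsmod-style input: the module name is the first column of each line.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Example:
# if lsmod | nouveau_loaded; then echo "nouveau is still loaded"; fi
```

Matching on the first column avoids false positives from other modules that merely list nouveau as a dependency.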
2.4 Install CUDA 8.0

$sudo service lightdm stop

Change to the directory that contains cuda_8.0.27_linux.run:
$cd /home/smith/Downloads
$sudo chmod +x cuda_8.0.27_linux.run 
$sudo ./cuda_8.0.27_linux.run
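Press q to leave the EULA text, then answer the prompts as follows. Be sure to answer n for the graphics driver so that it is not installed at this stage: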
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?
(y)es/(n)o/(q)uit: n
Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/zhou ]:
Installing the CUDA Toolkit in /usr/local/cuda-8.0 …
When it finishes you will see:
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/zhou, but missing recommended libraries
Finally, set the environment variables; I put them directly in the system-wide profile file:
$sudo vim /etc/profile
Append these two lines at the end:
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
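Save and quit.
Run: $sudo ldconfig
The graphics driver is not installed yet at this point; once it is installed in the next step, check whether CUDA 8.0 was installed correctly.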
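
To confirm later that the two exports took effect, one option is a small helper that tests whether a colon-separated variable contains a given entry. This is a sketch only; path_contains is a hypothetical name, and /etc/profile is only re-read by a new login shell, so the check is meaningful after logging in again or rebooting.

```shell
# Succeeds if the colon-separated list in $1 contains the entry $2 exactly.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example (in a new login shell):
# path_contains "$PATH" /usr/local/cuda-8.0/bin || echo "CUDA bin dir is not on PATH"
# path_contains "$LD_LIBRARY_PATH" /usr/local/cuda-8.0/lib64 || echo "CUDA lib64 is not on LD_LIBRARY_PATH"
```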
 
To uninstall CUDA:
$cd /usr/local/cuda-8.0/bin
$sudo ./uninstall_cuda_8.0.pl 
2.5 Install the graphics driver
Change to the driver directory:
$cd /home/smith/Downloads
$sudo ./NVIDIA-Linux-x86_64-367.44.run
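Follow the prompts through the installation. I don't remember every step, but the main ones are accepting the license, registering the module with the system kernel, registering with the new path, and updating the X server configuration. When the installer finishes it returns to the command line automatically.
Reboot: $sudo reboot
Enter your password to log in to the desktop.
2.6 Check the earlier installation
A folder named NVIDIA_CUDA-8.0_Samples now appears in the home directory. Open a terminal and change to that directory:
$sudo make -j8  # build the samples; my machine has 8 threads, so I use them all
After about 2 minutes the build finishes; then run: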
$sudo ./1_Utilities/deviceQuery/deviceQuery
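If output like the following appears, CUDA 8.0 is installed correctly (I forgot to take a screenshot; the output below is from a GTX 670 with CUDA 6.5):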
./deviceQuery Starting...  
CUDA Device Query (Runtime API) version (CUDART static linking)  
Detected 1 CUDA Capable device(s)  
Device 0: "GeForce GTX 670"  
  CUDA Driver Version / Runtime Version          6.5 / 6.5  
  CUDA Capability Major/Minor version number:    3.0  
  Total amount of global memory:                 4095 MBytes (4294246400 bytes)  
  ( 7) Multiprocessors, (192) CUDA Cores/MP:     1344 CUDA Cores  
  GPU Clock rate:                                1098 MHz (1.10 GHz)  
  Memory Clock rate:                             3105 Mhz  
  Memory Bus Width:                              256-bit  
  L2 Cache Size:                                 524288 bytes  
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)  
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers  
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers  
  Total amount of constant memory:               65536 bytes  
  Total amount of shared memory per block:       49152 bytes  
  Total number of registers available per block: 65536  
  Warp size:                                     32  
  Maximum number of threads per multiprocessor:  2048  
  Maximum number of threads per block:           1024  
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)  
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)  
  Maximum memory pitch:                          2147483647 bytes  
  Texture alignment:                             512 bytes  
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)  
  Run time limit on kernels:                     Yes  
  Integrated GPU sharing Host Memory:            No  
  Support host page-locked memory mapping:       Yes  
  Alignment requirement for Surfaces:            Yes  
  Device has ECC support:                        Disabled  
  Device supports Unified Addressing (UVA):      Yes  
  Device PCI Bus ID / PCI location ID:           1 / 0  
  Compute Mode:  
  < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >    
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GeForce GTX 670  
Result = PASS
As you can see, the last line reports PASS, so CUDA is installed correctly.
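
If you would rather not read the whole report, the PASS check can be scripted. A minimal sketch, assuming the report contains a line beginning with "Result = PASS" as in the output above; the function name device_query_passed is hypothetical.

```shell
# Succeeds if the deviceQuery report read from stdin contains a line
# that begins with "Result = PASS".
device_query_passed() {
    grep -q '^Result = PASS'
}

# Example (run from the samples directory):
# ./1_Utilities/deviceQuery/deviceQuery | device_query_passed && echo "CUDA OK"
```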
You can also run:
$nvcc --version
to check the nvcc version, and
$nvidia-smi
to show the GPU status (the output differs from machine to machine).
2.7 Install ATLAS
$sudo apt-get install libatlas-base-dev
This was in fact already installed with the dependencies earlier.
2.8 Install cuDNN
$tar -zxvf cudnn-8.0-linux-x64-v5.1-prod.tgz  
$cd cuda 
$sudo cp lib64/lib* /usr/local/cuda/lib64/  
$sudo cp include/cudnn.h /usr/local/cuda/include/
Update the symbolic links:
$cd /usr/local/cuda/lib64/
$sudo chmod +r libcudnn.so.5.1.5
$sudo ln -sf libcudnn.so.5.1.5 libcudnn.so.5
$sudo ln -sf libcudnn.so.5 libcudnn.so
Refresh the linker cache:
$sudo ldconfig
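
To double-check which cuDNN version was copied, the version macros in the header can be read back. This is a sketch that assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the cuDNN 5.x headers do; the function name cudnn_version is made up for illustration.

```shell
# Print the cuDNN version as major.minor.patch, read from the
# "#define CUDNN_MAJOR/MINOR/PATCHLEVEL" lines of the header file $1.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

# Example: cudnn_version /usr/local/cuda/include/cudnn.h
```

The printed version should match the suffix of the libcudnn.so file used in the symbolic links above.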
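2.9 Fetch the Caffe source
$git clone https://github.com/BVLC/caffe.git
2.10 Install pip and easy_install for Python to make installing packages easier (from here on I mostly follow Han Xiaoyang's steps; many thanks to him)
(The download is very slow...)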
$wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py 
$sudo python ez_setup.py --insecure
$wget https://bootstrap.pypa.io/get-pip.py
$sudo python get-pip.py
2.11 Install the Python dependencies (adjust the paths to match your own directories)
$pip install pyopenssl ndg-httpsclient pyasn1
$cd caffe/python
Switch to root first, then run:
$for req in $(cat requirements.txt); do pip install $req; done
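This step is also rather slow, so be patient and go do something else for a while ^_^ (I came back from doing something else and it still wasn't done...)
If a download times out, run the command again, several times if necessary. Even after many attempts some of mine still failed, so I installed them with the following command instead:
$sudo apt-get install python-sklearn python-skimage python-h5py python-protobuf python-leveldb python-networkx python-nose python-pandas python-gflags cython ipython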
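
The "run it again until it works" approach for timed-out downloads can be automated with a small retry wrapper. The following is a sketch, not something from the original steps; retry is a hypothetical helper that reruns a command up to a given number of times.

```shell
# Run a command up to $1 times, stopping at the first success.
# Usage: retry MAX_ATTEMPTS command [args...]
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        "$@" && return 0
        echo "attempt $attempt of $max_attempts failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (run from caffe/python):
# for req in $(cat requirements.txt); do retry 5 pip install "$req"; done
```

Packages that still fail after all attempts can then be installed through apt-get as shown above.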
2.12 Edit the Makefile configuration that Caffe needs
$cd caffe 
$cp Makefile.config.example Makefile.config 
$sudo gedit Makefile.config 
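Makefile.config holds the paths of the dependency libraries and the various build options. Uncomment USE_CUDNN := 1 to enable GPU acceleration through cuDNN, and set USE_LMDB := 1 and USE_LEVELDB := 1.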
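
As an alternative to editing by hand, the USE_CUDNN line can be uncommented with sed. This is a sketch that assumes the file contains the line "# USE_CUDNN := 1" as in Makefile.config.example; enable_cudnn is a hypothetical helper name.

```shell
# Uncomment the "USE_CUDNN := 1" line in the Makefile.config given as $1.
# The original file is kept next to it with a .bak suffix.
enable_cudnn() {
    sed -i.bak 's/^#[[:space:]]*USE_CUDNN[[:space:]]*:=[[:space:]]*1/USE_CUDNN := 1/' "$1"
}

# Example (run from the Caffe root directory):
# enable_cudnn Makefile.config && grep '^USE_CUDNN' Makefile.config
```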
Set up the runtime environment so that the CUDA libraries can be found: create caffe.conf in the /etc/ld.so.conf.d directory,
$sudo gedit /etc/ld.so.conf.d/caffe.conf
Add:
/usr/local/cuda/lib64
Save and quit, then run:
$sudo ldconfig
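2.13 Build caffe and pycaffe
Change to the Caffe root directory:
$make -j8
Test the result:
$make test -j8
$make runtest -j8
(Every test in runtest should pass. I once had a few failures; Caffe still ran, but the accuracy did not improve during training.)
$make pycaffe -j8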
 
Part 3: Try it out on CIFAR-10
$cd /home/smith/caffe
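$sudo sh data/cifar10/get_cifar10.sh  (the script downloads too slowly; fetch the archive with a download manager such as Xunlei, copy it in, then unpack it the way the script does)
$sudo sh examples/cifar10/create_cifar10.sh
$sudo sh examples/cifar10/train_quick.sh
The network then initializes and starts training; the loss begins to fall, and the optimization finishes quickly.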
 
 
 
PS: If import caffe fails in Python, add:
import sys
sys.path.append("/home/smith/caffe/python")

or: $export PYTHONPATH=$PYTHONPATH:/home/smith/caffe/python


