Setting up a deep learning environment on Ubuntu 18.04 + RTX 2080

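Getting this environment working took me a long time, mostly because I kept going back and forth over version mismatches between the components. For the record, here are the hardware and software versions that finally ran successfully: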

1. CPU: 8700K; GPU: RTX 2080
2. Ubuntu 18.04, 64-bit
3. gcc and g++ set to 7.3 (the version that ships with Ubuntu 18.04)
4. NVIDIA driver 410.78, installed from the .run file! See the "Manual Install using the Official Nvidia.com driver" section of https://linuxconfig.org/how-to-install-the-nvidia-drivers-on-ubuntu-18-04-bionic-beaver-linux
5. gcc and g++ switched to version 5
6. CUDA: cuda_9.0.176.1_linux (the Ubuntu 17.10 64-bit build, plus its 4 patches! See https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/538832/)
7. cuDNN: libcudnn7_7.0.5.15-1+cuda9.0_amd64 (see the link in item 6; the test that uses the bundled examples can be skipped)
8. Python 3.6
9. If TensorFlow is already installed, uninstall it from the Python environment first, then install with pip install tensorflow-gpu==1.8.0

As a side note: if you keep the versions in items 1-8 above but swap item 9 for tensorflow-gpu==1.12, training fails with this error:

E tensorflow/stream_executor/cuda/cuda_dnn.cc:363] Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.1.4. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

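This shows that each tensorflow-gpu release is tied to a particular cuDNN version. It looks like it should be possible to stay on this setup and simply upgrade to a newer cuDNN build for CUDA 9.0, but I did not pursue that here; I will try it when I have time.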
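The rule stated in that error message can be sketched in a few lines of Python. This is only an illustration of the wording in the log (same major version, and for cuDNN 7.0 or later a runtime minor version that is equal or higher); it is not TensorFlow's actual check, and the function names are made up for this example.

```python
def parse_version(text):
    """Turn a version string such as '7.0.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cudnn_runtime_ok(loaded, compiled):
    """Apply the rule quoted in the error message (hypothetical helper).

    loaded   -- cuDNN version found at runtime, e.g. '7.0.5'
    compiled -- cuDNN version TensorFlow was built against, e.g. '7.1.4'
    """
    loaded_major, loaded_minor = parse_version(loaded)[:2]
    compiled_major, compiled_minor = parse_version(compiled)[:2]
    if loaded_major != compiled_major:
        return False
    if compiled_major >= 7:
        # cuDNN 7.0 or later: a higher runtime minor version is accepted.
        return loaded_minor >= compiled_minor
    # Older cuDNN: the minor version has to match exactly.
    return loaded_minor == compiled_minor


# The combination from the log: runtime 7.0.5, compiled against 7.1.4.
print(cudnn_runtime_ok("7.0.5", "7.1.4"))  # False
```

Read this way, any cuDNN 7.x build for CUDA 9.0 with a minor version of 1 or higher would pass the check for this particular tensorflow-gpu build.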

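So what is cuDNN actually for? "Via TensorFlow (or Theano, or CNTK), Keras is able to run seamlessly on both CPUs and GPUs. When running on CPU, TensorFlow is itself wrapping a low-level library for tensor operations called Eigen. On GPU, TensorFlow wraps a library of well-optimized deep-learning operations called the NVIDIA CUDA Deep Neural Network library (cuDNN)." In other words, it is a key component that deep learning on NVIDIA GPUs depends on.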

As a bonus, once all of this was installed I ran a simple TensorFlow model as a benchmark. The results:

Shape: (10000, 10000) Device: /cpu:0
Time taken: 0:00:05.056056

Shape: (10000, 10000) Device: /gpu:0
Time taken: 0:00:00.798914

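So even an 8700K overclocked to 5 GHz, a 6-core, 12-thread chip at the top of the CPU ranking charts, is far behind the GPU at parallel computation (roughly 6x slower in this test). The gap is probably about as big as the one between karate or taekwondo and real Chinese kung fu, haha!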
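For anyone who wants to try a comparable test, here is a minimal sketch of a matrix-multiplication benchmark written against the TensorFlow 1.x graph API. It is an assumption about what such a test could look like, not the exact script that produced the numbers above; run_benchmark and format_report are names invented for this example, and the measured time includes session start-up.

```python
from datetime import datetime


def format_report(shape, device, elapsed):
    """Format one result in the same two-line layout as the output above."""
    return "Shape: {} Device: {}\nTime taken: {}".format(shape, device, elapsed)


def run_benchmark(device, size=10000):
    """Time one size x size matrix multiplication on the given device.

    device is a TensorFlow device string such as '/cpu:0' or '/gpu:0'.
    """
    # Imported here so the file still loads on a machine without TensorFlow.
    import tensorflow as tf

    shape = (size, size)
    with tf.device(device):
        matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
        product = tf.matmul(matrix, tf.transpose(matrix))
        total = tf.reduce_sum(product)

    # The timer covers session creation as well as the computation itself.
    start = datetime.now()
    with tf.Session() as session:
        session.run(total)
    return format_report(shape, device, datetime.now() - start)
```

Calling print(run_benchmark("/cpu:0")) and then print(run_benchmark("/gpu:0")) prints one report per device; the timings will vary with the hardware and the installed versions.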
