Getting started with TensorFlow

The usual routine for picking up something new:

I. Install TF (not the GPU build, not the TPU build, and certainly not an AI build). My system is CentOS 7.5, so I followed the official Linux guide:

https://www.tensorflow.org/install/install_linux

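Where there is code there is trouble, so the install was never going to go smoothly. The pitfalls and their fixes:

1. In step 3, the command easy_install -U pip kept timing out on the network. The fix is to point it at a mirror: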

easy_install -i http://pypi.douban.com/simple/ -U pip

My pip version was actually already new enough; this just upgraded it to a later release.

Before the upgrade:

$ pip -V
    pip 9.0.1 from /home/xxx/test/tensorflow/lib/python2.7/site-packages (python 2.7)

After the upgrade:

$ pip -V
    pip 18.0 from /home/xxx/test/tensorflow/lib/python2.7/site-packages/pip-18.0-py2.7.egg/pip (python 2.7)

2. In step 5, "pip install --upgrade tensorflow" failed, and this time adding -i http://pypi.douban.com/simple/ did not help either.

2.1) First, this machine is an aarch64 platform, so the .whl path has to be changed. According to "Installing the TensorFlow framework on ARM 64-bit" (https://blog.csdn.net/qq_31261509/article/details/79835136), the wheel should be tensorflow-1.8.0-cp27-none-linux_aarch64.whl. So the command became: pip install --upgrade https://download.tensorflow.google.cn/linux/cpu/tensorflow-1.8.0-cp27-none-linux_aarch64.whl. However, download.tensorflow.google.cn only delivered about 1.x KB/s, and the download timed out.
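To avoid guessing the wheel name by hand, the interpreter can report the pieces that go into it. This is only a sketch: the function name expected_wheel_name is hypothetical, and the cpXY-none-linux_ARCH pattern is simply the one quoted above for the ARM build (wheels for other platforms use different tags).

```python
import platform
import sys


def expected_wheel_name(tf_version, py_major, py_minor, machine):
    """Build a wheel file name in the cpXY-none-linux_ARCH pattern quoted above."""
    return "tensorflow-{}-cp{}{}-none-linux_{}.whl".format(
        tf_version, py_major, py_minor, machine
    )


if __name__ == "__main__":
    # platform.machine() reports e.g. "aarch64" on 64-bit ARM or "x86_64" on Intel/AMD.
    print(
        expected_wheel_name(
            "1.8.0", sys.version_info[0], sys.version_info[1], platform.machine()
        )
    )
```

Running it inside the virtualenv shows which Python tag and architecture the wheel has to match before any download is attempted.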

2.2) Next, try to speed up the network, following "Speeding up access to GitHub (Windows + Linux)" (https://blog.csdn.net/senver_wen/article/details/80834652). The command was updated to: pip install --upgrade https://github.com/lhelontra/tensorflow-on-arm/releases/download/v1.8.0/tensorflow-1.8.0-cp27-none-linux_aarch64.whl

# Speeding up access to GitHub
a. Edit /etc/hosts
    13.229.188.59   www.github.com
    151.101.228.133 assets-cdn.github.com
    151.101.73.194  github.global.ssl.fastly.net
b. Install nscd and flush the DNS cache (CentOS 7.5)
    yum install nscd
    systemctl restart nscd
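Since the same hosts-file edit is repeated later for another hostname, a small helper can generate the lines and catch a mistyped address. This is a minimal sketch in Python 3; format_hosts_entries is a hypothetical name, and the addresses are the ones listed above, which may well have changed since.

```python
import ipaddress


def format_hosts_entries(ip_by_host):
    """Return /etc/hosts lines ("IP<TAB>hostname"), sorted by hostname."""
    lines = []
    for host, ip in sorted(ip_by_host.items()):
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        lines.append("{}\t{}".format(ip, host))
    return lines


entries = format_hosts_entries({
    "www.github.com": "13.229.188.59",
    "assets-cdn.github.com": "151.101.228.133",
    "github.global.ssl.fastly.net": "151.101.73.194",
})
print("\n".join(entries))
```

The printed lines can be appended to /etc/hosts by hand; the script itself does not touch the file.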

2.3) Once tensorflow-1.8.0-cp27-none-linux_aarch64.whl finished downloading, an error appeared almost immediately.

Collecting numpy>=1.13.3 (from tensorflow==1.8.0)
  Could not find a version that satisfies the requirement numpy>=1.13.3 (from tensorflow==1.8.0) (from versions: )
No matching distribution found for numpy>=1.13.3 (from tensorflow==1.8.0)

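I suspected the numpy installation, so I installed it manually with pip install numpy. Using the webmaster DNS tool (http://tool.chinaz.com/dns) again, I looked up a suitable IP address for files.pythonhosted.org: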

151.101.73.63   files.pythonhosted.org

After repeating steps a. and b. from 2.2, numpy installed quickly (it is a Python package for scientific computing).

2.4) Finally, the following message appeared:

Successfully installed absl-py-0.3.0 astor-0.7.1 backports.weakref-1.0.post1 bleach-1.5.0 enum34-1.1.6 funcsigs-1.0.2 futures-3.2.0 gast-0.2.0 grpcio-1.14.1 html5lib-0.9999999 markdown-2.6.11 mock-2.0.0 pbr-4.2.0 protobuf-3.6.0 six-1.11.0 tensorboard-1.8.0 tensorflow-1.8.0 termcolor-1.1.0 werkzeug-0.14.1
(tensorflow) [xxxtest-3 bin]$ 

2.5) After installing, run a Hello program to validate the install

https://www.tensorflow.org/install/install_linux#ValidateYourInstallation

# Python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

The result: it requires GLIBC_2.23, but my system currently has GLIBC 2.17.

ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found (required by /home/xx/test/tensorflow-v1.8/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)

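Annoyingly, nothing warned me at install time; the error only showed up at run time.

Since other people also use this system, I wanted to avoid upgrading libc. So I tried downgrading tensorflow to 1.4. Same result: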

ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found (required by /home/xxx/test/tensorflow-v1.4/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)
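Because pip does not check the C library, it can help to compare the system glibc against the version named in the error before installing. A minimal sketch, assuming the requirement is the 2.23 from the ImportError above; the helper names are hypothetical.

```python
import platform


def parse_version(text):
    """Turn "2.17" into (2, 17) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(installed, required):
    """True when the installed glibc version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    # platform.libc_ver() returns a (library, version) pair for the running interpreter.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        verdict = "ok" if glibc_satisfies(version, "2.23") else "too old for GLIBC_2.23"
        print(version, verdict)
    else:
        print("could not determine the glibc version")
```

Comparing tuples rather than strings matters here: as plain strings, "2.9" would sort after "2.23".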

2.6) Alternatively, try upgrading glibc to 2.23.

Reference post: https://blog.csdn.net/qq708986022/article/details/77896791

2.7) Installing TensorFlow in a VM (my own environment is more dependable)

The Ubuntu VM ships python3, and I again chose the virtualenv install. At step 5, pip3 install --upgrade tensorflow automatically picked the wheel tensorflow-1.10.0-cp36-cp36m-manylinux1_x86_64.whl.
The log:
Downloading https://files.pythonhosted.org/packages/ee/e6/a6d371306c23c2b01cd2cb38909673d17ddd388d9e4b3c0f6602bfd972c8/tensorflow-1.10.0-cp36-cp36m-manylinux1_x86_64.whl (58.4MB)
It still timed out, so I wondered whether pip3's timeout could be configured.
A web search turned up the following:
 
$ cat ~/.pip/pip.conf
[global]
timeout = 6000
index-url = http://e.pypi.python.org/simple
trusted-host = pypi.douban.com
[install]
use-mirrors = true
mirrors = http://e.pypi.python.org
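A config like this can also be generated rather than typed. The sketch below is an illustration under assumptions, not the file the author used: build_pip_conf is a hypothetical helper, the mirror URL is the douban one from step 1, trusted-host is derived from the index URL so the two always match, and the [install] mirror options are left out on the assumption that newer pip releases no longer use them.

```python
import configparser
import io
from urllib.parse import urlparse


def build_pip_conf(index_url, timeout_seconds):
    """Render a minimal pip config: a longer timeout plus one mirror index."""
    config = configparser.ConfigParser()
    config["global"] = {
        "timeout": str(timeout_seconds),
        "index-url": index_url,
        # trusted-host must name the same host as index-url to have any effect.
        "trusted-host": urlparse(index_url).hostname,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_pip_conf("http://pypi.douban.com/simple/", 6000))
```

The output can be saved as the per-user pip config file; the script only prints it.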

 

Once again the install reported success.

Successfully built termcolor absl-py gast
launchpadlib 1.10.6 requires testresources, which is not installed.
Installing collected packages: werkzeug, setuptools, protobuf, numpy, markdown, tensorboard, astor, termcolor, absl-py, gast, grpcio, tensorflow
  Found existing installation: setuptools 40.0.0
    Uninstalling setuptools-40.0.0:
      Successfully uninstalled setuptools-40.0.0
  Found existing installation: protobuf 3.0.0
    Not uninstalling protobuf at /usr/lib/python3/dist-packages, outside environment /home/xxx/tensorflow
    Can't uninstall 'protobuf'. No files were found to uninstall.
  Found existing installation: numpy 1.15.0
    Uninstalling numpy-1.15.0:
      Successfully uninstalled numpy-1.15.0
Successfully installed absl-py-0.3.0 astor-0.7.1 gast-0.2.0 grpcio-1.14.1 markdown-2.6.11 numpy-1.14.5 protobuf-3.6.0 setuptools-39.1.0 tensorboard-1.10.0 tensorflow-1.10.0 termcolor-1.1.0 werkzeug-0.14.1
(tensorflow) -VirtualBox:~/tensorflow$ 

Following the validation steps on the official site, I ran Hello TensorFlow, and it worked.

$ python
Python 3.6.5 (default, Apr  1 2018, 05:46:30) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-08-13 17:51:02.640633: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
>>> 

At this point, TensorFlow is installed successfully!

The official Python 3.6 packages for CPU and GPU are:

CPU only: https://download.tensorflow.google.cn/linux/cpu/tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl

With GPU support: https://download.tensorflow.google.cn/linux/gpu/tensorflow_gpu-1.8.0-cp36-cp36m-linux_x86_64.whl

II. Run the TF benchmarks

https://www.tensorflow.org/performance/benchmarks

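1. Analyze the TF framework and study one key part, or a part that looks interesting. TF's native language, Python, also needs to be learned (TF currently also supports C, Java, and Go).

III. Open questions so far:

1. How does TensorFlow connect to big data? What mechanism is used for input?

See the deployment docs at https://www.tensorflow.org/deploy/. The official site lists: How to run TensorFlow on Hadoop, which has a highly self-explanatory title. (https://www.tensorflow.org/deploy/hadoop)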

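import tensorflow as tf

# Queue of input files addressed by hdfs:// URLs (TF 1.x queue-based input API).
filename_queue = tf.train.string_input_producer([
    "hdfs://namenode:8020/path/to/file1.csv",
    "hdfs://namenode:8020/path/to/file2.csv",
])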

2. How is TensorFlow laid out on a distributed system?

See the deployment docs at https://www.tensorflow.org/deploy/. The official site lists: Distributed TensorFlow, which explains how to create a cluster of TensorFlow servers. (https://www.tensorflow.org/deploy/distributed)

Each tf.train.ClusterSpec construction and the tasks it makes available:

tf.train.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
Available tasks:
    /job:local/task:0
    /job:local/task:1

tf.train.ClusterSpec({
    "worker": [ "worker0.example.com:2222", "worker1.example.com:2222", "worker2.example.com:2222" ],
    "ps": [ "ps0.example.com:2222", "ps1.example.com:2222" ]})
Available tasks:
    /job:worker/task:0
    /job:worker/task:1
    /job:worker/task:2
    /job:ps/task:0
    /job:ps/task:1
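The mapping from a cluster definition to task names is mechanical: each job name gets one task per listed address, numbered from 0. The sketch below shows that rule in plain Python without importing TensorFlow; available_tasks is a hypothetical helper, not part of the tf.train API.

```python
def available_tasks(cluster):
    """List task names as /job:<name>/task:<index>, in the order the jobs are given."""
    tasks = []
    for job_name, addresses in cluster.items():
        for index, _address in enumerate(addresses):
            tasks.append("/job:{}/task:{}".format(job_name, index))
    return tasks


cluster = {
    "worker": [
        "worker0.example.com:2222",
        "worker1.example.com:2222",
        "worker2.example.com:2222",
    ],
    "ps": ["ps0.example.com:2222", "ps1.example.com:2222"],
}
for task in available_tasks(cluster):
    print(task)
```

The same dictionary is what would be passed to tf.train.ClusterSpec, so the helper is a quick way to see which task indices a given layout produces.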

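3. How does TensorFlow run on CPU + AI chips? Is it currently CPU only, CPU+GPU, or CPU+TPU?

IV. Going deeper:

1. TensorFlow is broadly capable, but it is mainly used to build deep neural network models (an overview of DNN models: https://blog.csdn.net/scutjy2015/article/details/74170794). The easiest way to get started with TensorFlow is Eager Execution.

The official guide is very thorough: https://www.tensorflow.org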


Reposted from blog.csdn.net/don_chiang709/article/details/81512122