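Accelerating TensorFlow Computation (Multi-GPU and Distributed)

1 Basic GPU Operations

In TensorFlow, the tf.device() function lets you manually specify the device an operation runs on.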

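For CPUs, TensorFlow by default does not distinguish between them even if the machine has several CPUs; all of them share the name /cpu:0.

GPUs, by contrast, are named individually: /gpu:0, /gpu:1, … /gpu:n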

TensorFlow offers a quick way to see which device each operation runs on: set log_device_placement=True in tf.ConfigProto.

In an environment that has a GPU, TensorFlow prefers the GPU:

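import tensorflow as tf

# Two constant vectors and their element-wise sum.
a = tf.constant([1.0, 2.0, 3.0], shape=[3], name='a')
b = tf.constant([1.0, 2.0, 3.0], shape=[3], name='b')
c = a + b
# log_device_placement=True logs the device chosen for each operation.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))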

[ 2.  4.  6.]

When run from an IDE (PyCharm), the following is also printed:

add: (Add): /job:localhost/replica:0/task:0/gpu:0

a: (Const): /job:localhost/replica:0/task:0/gpu:0

b: (Const): /job:localhost/replica:0/task:0/gpu:0

This shows the device that executed each operation.

Of course, you can also manually assign specific operations to a device:

import tensorflow as tf

# Pin the two constants to the CPU.
with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0], shape=[3], name='a')
    b = tf.constant([1.0, 2.0, 3.0], shape=[3], name='b')
# Pin the addition to the first GPU.
with tf.device('/gpu:0'):
    c = a + b
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))

[ 2.  4.  6.]

add: (Add): /job:localhost/replica:0/task:0/gpu:0

b: (Const): /job:localhost/replica:0/task:0/cpu:0

a: (Const): /job:localhost/replica:0/task:0/cpu:0
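
The placement lines printed above follow a regular "name: (OpType): device" shape, so they can be collected into a small table when a graph has many operations. The following is a minimal sketch in plain Python, assuming the log format is exactly the one shown in the sample output; other TensorFlow versions may print it differently. The helper names parse_placement and short_device are hypothetical and not part of TensorFlow.

```python
import re

# Pattern for lines such as "add: (Add): /job:localhost/replica:0/task:0/gpu:0".
# Assumption: the format is taken from the sample output in this article.
PLACEMENT_RE = re.compile(r'^(?P<name>[^:]+): \((?P<op>[^)]+)\): (?P<device>\S+)$')


def parse_placement(line):
    """Return (op_name, op_type, device) for one placement line, or None."""
    match = PLACEMENT_RE.match(line.strip())
    if match is None:
        return None
    return match.group('name'), match.group('op'), match.group('device')


def short_device(device):
    """Reduce a full device path to its last component, e.g. 'gpu:0'."""
    return device.rsplit('/', 1)[-1]


log_lines = [
    'add: (Add): /job:localhost/replica:0/task:0/gpu:0',
    'b: (Const): /job:localhost/replica:0/task:0/cpu:0',
    'a: (Const): /job:localhost/replica:0/task:0/cpu:0',
]

# Map each operation name to the short name of the device it ran on.
placements = {}
for line in log_lines:
    parsed = parse_placement(line)
    if parsed is not None:
        name, op_type, device = parsed
        placements[name] = short_device(device)

print(placements)
```

Lines that do not match the pattern, such as the printed result vector, are skipped rather than raising an error.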

Notes:

Not every operation can run on a GPU:
(1) tf.one_hot has to run on the CPU.

(2) On a GPU, tf.Variable only supports floating-point parameter types (float32 and float64, i.e. double).

To avoid these problems, tf.ConfigProto provides the allow_soft_placement parameter. When it is set to True, TensorFlow automatically places any operation that cannot run on the GPU onto the CPU instead.
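
As a rough illustration of this fallback, the sketch below requests the GPU for an integer variable and relies on allow_soft_placement to keep the program runnable. It is a sketch under assumptions, not a definitive recipe: the tf1 alias is introduced here only so the 1.x-style Session and ConfigProto can also be reached on TensorFlow 2.x, and the variable name counter is arbitrary.

```python
import tensorflow as tf

# Assumption: on TensorFlow 2.x the 1.x-style Session and ConfigProto live
# under tf.compat.v1; on older 1.x releases they are top-level names.
tf1 = tf.compat.v1 if hasattr(tf, 'compat') and hasattr(tf.compat, 'v1') else tf

graph = tf.Graph()
with graph.as_default():
    # Request the GPU for an int32 variable, a type outside the
    # floating-point types listed in note (2).
    with tf.device('/gpu:0'):
        counter = tf.Variable(3, name='counter')
    init = tf1.global_variables_initializer()

# allow_soft_placement=True lets TensorFlow fall back to the CPU for any
# operation that cannot be placed on the requested device.
config = tf1.ConfigProto(allow_soft_placement=True,
                         log_device_placement=True)
with tf1.Session(graph=graph, config=config) as sess:
    sess.run(init)
    counter_value = sess.run(counter)

print(counter_value)
```

With log_device_placement also enabled, the placement log shows which device each operation ended up on, which makes any fallback visible.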

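Although GPUs can accelerate TensorFlow computation, you generally should not place every computation on the GPU. A good practice is to put compute-intensive operations on the GPU and the remaining operations on the CPU.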
