[TensorFlow 1.0+] Notes on commonly used functions

tf.train.ExponentialMovingAverage(decay, steps)

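The tf.train.ExponentialMovingAverage class updates parameters with a moving average. Constructing it requires a decay rate (decay), which controls how quickly the averaged values change. It also maintains a shadow variable for each tracked variable (that is, the averaged value of the parameter). The shadow variable starts at the variable's initial value and is then updated as follows: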

shadow_variable = decay * shadow_variable + (1-decay) * variable

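Here shadow_variable is the shadow variable, variable is the variable being tracked (that is, the value currently assigned to it), and decay is the decay rate. decay is usually set close to 1 (for example 0.99 or 0.999). The larger decay is, the more stable the model, because a larger decay makes the shadow value change more slowly and settle down.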
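To get a feel for how decay controls the update speed, here is a minimal pure-Python sketch of the update rule above. It does not use TensorFlow; the helper names `ema_update` and `track` are made up for this illustration, and the shadow value is started at 0.0 only so that the effect is easy to see.

```python
def ema_update(shadow, variable, decay):
    """One step of: shadow = decay * shadow + (1 - decay) * variable."""
    return decay * shadow + (1.0 - decay) * variable


def track(decay, target=1.0, num_steps=10):
    """Start the shadow at 0.0 and follow a variable held constant at `target`."""
    shadow = 0.0
    for _ in range(num_steps):
        shadow = ema_update(shadow, target, decay)
    return shadow


fast = track(decay=0.9)    # smaller decay: the shadow catches up quickly
slow = track(decay=0.99)   # larger decay: the shadow moves slowly
print(fast, slow)
```

After ten updates the shadow with decay=0.9 has covered roughly 65% of the distance to the target, while the one with decay=0.99 has covered only about 10%.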

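tf.train.ExponentialMovingAverage can also adjust decay automatically:

decay = min(decay, (1 + steps) / (10 + steps))

steps is the number of iterations so far. You set it yourself by passing it as the second argument (num_updates).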
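As a quick sanity check of this formula, the sketch below evaluates it in plain Python for a few step counts. The function name `effective_decay` is an assumption made for this illustration and is not part of TensorFlow.

```python
def effective_decay(decay, steps):
    """Decay actually used: min(decay, (1 + steps) / (10 + steps))."""
    return min(decay, (1.0 + steps) / (10.0 + steps))


# Early in training the ratio is small, so the average follows the variable
# closely; as steps grows the ratio approaches 1 and the configured decay wins.
for steps in (0, 10, 100, 1000, 10000):
    print(steps, effective_decay(0.99, steps))
```

With decay=0.99 the ratio is the smaller term for small step counts (0.1 at step 0), and the configured 0.99 takes over once (1 + steps) / (10 + steps) exceeds it.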

For example:

import tensorflow as tf

v1 = tf.Variable(0, dtype=tf.float32)
# Update counter passed to the moving average; it is not a model parameter.
step = tf.Variable(0, trainable=False)

ema = tf.train.ExponentialMovingAverage(0.99, step)
maintain_average = ema.apply([v1])

with tf.Session() as sess:
    # tf.global_variables_initializer() replaces the deprecated
    # tf.initialize_all_variables().
    sess.run(tf.global_variables_initializer())

    # Both v1 and its shadow variable start at 0.
    print(sess.run([v1, ema.average(v1)]))

    sess.run(tf.assign(v1, 5))  # set v1 to 5
    sess.run(maintain_average)
    # decay = min(0.99, 1/10) = 0.1, shadow = 0.1*0 + 0.9*5 = 4.5
    print(sess.run([v1, ema.average(v1)]))

    sess.run(tf.assign(step, 10000))  # steps = 10000
    sess.run(tf.assign(v1, 10))  # set v1 to 10
    sess.run(maintain_average)
    # decay = min(0.99, (1+10000)/(10+10000)) = 0.99, shadow = 0.99*4.5 + 0.01*10 = 4.555
    print(sess.run([v1, ema.average(v1)]))

    sess.run(maintain_average)
    # decay = min(0.99, (1+10000)/(10+10000)) = 0.99, shadow = 0.99*4.555 + 0.01*10 = 4.60945
    print(sess.run([v1, ema.average(v1)]))

Output:

[0.0, 0.0]
[5.0, 4.5]
[10.0, 4.5549998]
[10.0, 4.6094499]

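Explanation: every run of maintain_average updates the shadow variable, while variable keeps whatever value you assigned to it. If you do not assign a new value before the next run, variable stays the same and only the shadow variable is updated. If you do assign one, variable changes and the shadow variable moves toward the new value.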
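The arithmetic behind those four output lines can be replayed without TensorFlow. The following is only a sketch in plain Python floats (so the last digits differ slightly from the float32 values printed above); the function name `replay_updates` is made up for this illustration.

```python
def replay_updates(decay, updates):
    """Replay (steps, variable) pairs and return the shadow value after each."""
    shadow = 0.0  # the shadow variable starts at v1's initial value, 0
    history = []
    for steps, variable in updates:
        # Effective decay for this update, as in the formula above.
        d = min(decay, (1.0 + steps) / (10.0 + steps))
        shadow = d * shadow + (1.0 - d) * variable
        history.append(shadow)
    return history


# steps=0 with v1=5, then steps=10000 with v1=10, applied twice.
history = replay_updates(0.99, [(0, 5.0), (10000, 10.0), (10000, 10.0)])
print(history)
```

The three values correspond to the shadow column of the last three output lines: approximately 4.5, 4.555 and 4.60945.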

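tf.trainable_variables: returns the list of variables that will be trained

tf.all_variables: returns the list of all variables (deprecated in favor of tf.global_variables)

For example: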

import tensorflow as tf

v = tf.Variable(tf.constant(0.0, shape=[1], dtype=tf.float32), name='v')
v1 = tf.Variable(tf.constant(5, shape=[1], dtype=tf.float32), name='v1')

# trainable=False keeps global_step out of the trainable-variables list.
global_step = tf.Variable(tf.constant(5, shape=[1], dtype=tf.float32), name='global_step', trainable=False)
ema = tf.train.ExponentialMovingAverage(0.99, global_step)

for ele1 in tf.trainable_variables():
    print(ele1.name)
# tf.global_variables() replaces the deprecated tf.all_variables().
for ele2 in tf.global_variables():
    print(ele2.name)

Output:

v:0
v1:0

v:0
v1:0
global_step:0

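Analysis: the first loop prints two variables and the second prints three, because global_step was declared with the keyword argument trainable=False and is therefore not a trainable variable.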

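Usage of tf.control_dependencies():

In some machine learning programs we want to specify the order in which certain operations run. tf.control_dependencies() lets us do that.
control_dependencies(control_inputs) returns a context manager that sets up control dependencies. Used with the with keyword, it makes every operation constructed inside the context run only after control_inputs have executed.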

with g.control_dependencies([a, b, c]):
  # `d` and `e` will only run after `a`, `b`, and `c` have executed.
  d = ...
  e = ...

Calls to control_dependencies can be nested:

with g.control_dependencies([a, b]):
  # Ops constructed here run after `a` and `b`.
  with g.control_dependencies([c, d]):
    # Ops constructed here run after `a`, `b`, `c`, and `d`.

Passing None clears the dependencies:

with g.control_dependencies([a, b]):
  # Ops constructed here run after `a` and `b`.
  with g.control_dependencies(None):
    # Ops constructed here run normally, not waiting for either `a` or `b`.
    with g.control_dependencies([c, d]):
      # Ops constructed here run after `c` and `d`, also not waiting
      # for either `a` or `b`.

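Note:
Control dependencies apply only to operations constructed inside the context. Merely using an existing operation or tensor inside the context has no effect.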

# WRONG
def my_func(pred, tensor):
  t = tf.matmul(tensor, tensor)
  with tf.control_dependencies([pred]):
    # The matmul op is created outside the context, so no control
    # dependency will be added.
    return t

# RIGHT
def my_func(pred, tensor):
  with tf.control_dependencies([pred]):
    # The matmul op is created in the context, so a control dependency
    # will be added.
    return tf.matmul(tensor, tensor)
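Why the WRONG version fails is easier to see with a toy model. The sketch below is not TensorFlow code: `ToyGraph`, `ToyOp` and `create_op` are hypothetical names invented for this illustration. It only mimics the one behavior that matters here, namely that control inputs are recorded at the moment an operation is constructed.

```python
import contextlib


class ToyOp(object):
    """Hypothetical stand-in for a graph operation (not a TensorFlow class)."""

    def __init__(self, name, control_inputs):
        self.name = name
        self.control_inputs = list(control_inputs)


class ToyGraph(object):
    """Hypothetical graph that records control inputs when an op is created."""

    def __init__(self):
        self._stack = []  # one list of control inputs per active context

    @contextlib.contextmanager
    def control_dependencies(self, control_inputs):
        saved = self._stack
        if control_inputs is None:
            self._stack = []  # None clears everything inherited so far
        else:
            self._stack = saved + [list(control_inputs)]
        try:
            yield
        finally:
            self._stack = saved  # restore the enclosing context on exit

    def create_op(self, name):
        # Dependencies are captured here, at construction time.
        deps = [dep for level in self._stack for dep in level]
        return ToyOp(name, deps)


g = ToyGraph()
t = g.create_op("matmul_outside")            # built before entering the context

with g.control_dependencies(["pred"]):
    wrong = t                                 # only referenced inside the context
    right = g.create_op("matmul_inside")      # built inside the context
    with g.control_dependencies(["c"]):
        nested = g.create_op("nested")        # inherits "pred" and adds "c"
    with g.control_dependencies(None):
        cleared = g.create_op("cleared")      # None drops the inherited "pred"

print(wrong.control_inputs, right.control_inputs,
      nested.control_inputs, cleared.control_inputs)
```

In this toy model `wrong` ends up with no control inputs because it was built before the context was entered, which is the same reason returning `t` in the WRONG version adds no dependency on `pred`.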

Example:
When training a model, each training step may need to run two operations, a and b. In that case we can use the following code:

with tf.control_dependencies([a, b]):
    c = tf.no_op(name='train')  # tf.no_op does nothing
sess.run(c)

For a requirement this simple, the code above can be replaced with:

c = tf.group(a, b)
sess.run(c)
