tf.nn.dynamic_rnn explained in detail

tf.nn.dynamic_rnn

It differs significantly from tf.nn.static_rnn in its inputs, outputs, and parameters, so read and compare the two carefully (a short sketch of the difference follows).
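To make the contrast concrete, here is a minimal sketch of the two call styles (TF 1.x; the shapes, sizes, and scope names below are illustrative assumptions, not from the original post):

import tensorflow as tf

batch_size, max_time, embed_size = 32, 10, 64
x = tf.placeholder(tf.float32, [batch_size, max_time, embed_size])

# dynamic_rnn consumes the 3-D tensor directly and unrolls with a while_loop at run time
dyn_cell = tf.nn.rnn_cell.BasicRNNCell(128)
dyn_outputs, dyn_state = tf.nn.dynamic_rnn(dyn_cell, x, dtype=tf.float32,
                                           scope="dyn_rnn")

# static_rnn instead takes a Python list of max_time tensors, each shaped
# [batch_size, embed_size], and unrolls the graph statically
static_cell = tf.nn.rnn_cell.BasicRNNCell(128)
x_list = tf.unstack(x, axis=1)
static_outputs, static_state = tf.nn.static_rnn(static_cell, x_list,
                                                dtype=tf.float32,
                                                scope="static_rnn")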

tf.nn.dynamic_rnn(
    cell,
    inputs,
    sequence_length=None,
    initial_state=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None
)
'''
Args:
	cell: An instance of RNNCell.
	inputs: The RNN inputs. If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements. If time_major == True, this must be a Tensor of shape: [max_time, batch_size, ...], or a nested tuple of such elements. This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all the inputs, but otherwise the ranks and other shape components may differ. In this case, input to cell at each time-step will replicate the structure of these tuples, except for the time dimension (from which the time is taken). The input to cell at each time step will be a Tensor or (possibly nested) tuple of Tensors each with dimensions [batch_size, ...].
	sequence_length: (optional) An int32/int64 vector sized [batch_size]. Used to copy-through state and zero-out outputs when past a batch element's sequence length. So it's more for performance than correctness.
	initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
	dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
	parallel_iterations: (Default: 32). The number of iterations to run in parallel. Those operations which do not have any temporal dependency and can be run in parallel, will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.
	swap_memory: Transparently swap the tensors produced in forward inference but needed for back prop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.
	time_major: The shape format of the inputs and outputs Tensors. If true, these Tensors must be shaped [max_time, batch_size, depth]. If false, these Tensors must be shaped [batch_size, max_time, depth]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
	scope: VariableScope for the created subgraph; defaults to "rnn".
'''

Parameter description

Parameters
  • cell: an instance of RNNCell.
  • inputs: a tensor. If time_major == False (the default), its shape must be [batch_size, max_time, embed_size]; if time_major == True, its shape must be [max_time, batch_size, embed_size].
  • sequence_length: (optional) an int32/int64 vector of size [batch_size]. For time steps beyond a batch element's actual sequence length, no computation is performed: the RNN state is copied through from the previous step and the output at that step is all zeros (used to copy-through state and zero-out outputs when past a batch element's sequence length); see the sketch after this list.
  • initial_state: (optional) the initial state of the RNN. If cell.state_size (a single-layer RNNCell) is an integer, this must be a tensor of the appropriate type with shape [batch_size, cell.state_size]. If cell.state_size is a tuple (a multi-layer cell such as MultiRNNCell), this should be a tuple of tensors with shapes [batch_size, s] for s in cell.state_size.
  • time_major: the shape format of the inputs and outputs tensors. If True, these tensors are (and will be) shaped [max_time, batch_size, depth]; if False, they are (and will be) shaped [batch_size, max_time, depth].
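A minimal sketch of the sequence_length behaviour (the placeholder shapes and the example lengths are illustrative assumptions, not from the original post):

import tensorflow as tf

batch_size, max_time, embed_size = 4, 6, 8
x = tf.placeholder(tf.float32, [batch_size, max_time, embed_size])
seq_len = tf.placeholder(tf.int32, [batch_size])   # e.g. [6, 3, 5, 2]

cell = tf.nn.rnn_cell.LSTMCell(16)
outputs, state = tf.nn.dynamic_rnn(cell, x,
                                   sequence_length=seq_len,
                                   dtype=tf.float32)
# For the sample whose length is 3, outputs[1, 3:, :] comes back as zeros,
# and its entry in `state` is the state after step index 2 (its last valid step).
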
Returns

The return value is a tuple (outputs, state).

  • outputs: the output of the RNN's last layer at every time step, as a single tensor.
    If time_major == False, its shape is [batch_size, max_time, cell.output_size]; if time_major == True, its shape is [max_time, batch_size, cell.output_size].

  • state: the RNN state at the final time step. If cell.state_size is an integer (typically a single-layer RNNCell), state has shape [batch_size, cell.state_size]. If it is a tuple (typically a multi-layer RNNCell), state is a tuple of tensors with the corresponding shapes. Note: if the cells are LSTMCells, state is a tuple containing one LSTMStateTuple per layer, e.g. (LSTMStateTuple, LSTMStateTuple, LSTMStateTuple); a short sketch of unpacking it follows.
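For example, with a two-layer MultiRNNCell of LSTMCells, the returned state can be unpacked as below (a hedged sketch; the layer sizes and input shape are illustrative assumptions):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 20, 50])           # [batch, max_time, embed]
layers = [tf.nn.rnn_cell.LSTMCell(n) for n in [64, 128]]
cell = tf.nn.rnn_cell.MultiRNNCell(layers)

outputs, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
# outputs: [batch_size, 20, 128]  -- last layer, every time step
# state:   (LSTMStateTuple(c, h), LSTMStateTuple(c, h)) -- one tuple per layer
last_hidden = state[-1].h                                 # [batch_size, 128]
# last_hidden equals outputs[:, -1, :] only when every sequence runs the full 20 steps.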

A concrete example

Implementing a dynamic RNN

Simple examples:

import tensorflow as tf

# Illustrative input; the shape [batch_size, max_time, embed_size] is an assumption
batch_size, max_time, embed_size = 32, 10, 50
data = tf.placeholder(tf.float32, [batch_size, max_time, embed_size])

# create 2 LSTMCells
rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [128, 256]]

# create an RNN cell composed sequentially of a number of RNNCells
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)

# 'outputs' is a tensor of shape [batch_size, max_time, 256]
# 'state' is an N-tuple where N is the number of LSTMCells, containing a
# tf.contrib.rnn.LSTMStateTuple for each cell
outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                   inputs=data,
                                   dtype=tf.float32)

# --- second example: a single-layer BasicRNNCell ---
hidden_size = 64        # illustrative value
input_data = tf.placeholder(tf.float32, [batch_size, max_time, embed_size])

# create a BasicRNNCell
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)

# defining the initial state
initial_state = rnn_cell.zero_state(batch_size, dtype=tf.float32)

# 'outputs' is a tensor of shape [batch_size, max_time, cell_state_size]
# 'state' is a tensor of shape [batch_size, cell_state_size]
outputs, state = tf.nn.dynamic_rnn(rnn_cell, input_data,
                                   initial_state=initial_state,
                                   dtype=tf.float32)
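Continuing from the single-layer example above, a hedged sketch of actually running it (feeding random NumPy data is purely for illustration):

import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {input_data: np.random.rand(batch_size, max_time, embed_size).astype(np.float32)}
    out_val, state_val = sess.run([outputs, state], feed_dict=feed)
    print(out_val.shape)    # (32, 10, 64) -> [batch_size, max_time, hidden_size]
    print(state_val.shape)  # (32, 64)     -> [batch_size, hidden_size]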


Reposted from blog.csdn.net/qq_32806793/article/details/85322672