Self-Study Notes on the TensorFlow 2.0 Official Documentation (2)

Link to this article: https://blog.csdn.net/weixin_42072754/article/details/102696964

Let's explore what Sequential has to offer.

First, create an empty model with Sequential:

import tensorflow as tf
model = tf.keras.Sequential()

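Layers can then be added with the model.add() method.
There are many kinds of layers, though, and I don't recognize any of them. For example, here is the example given in the official docs: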

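from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Optionally, the first layer can receive an `input_shape` argument:
# (My reading: the first layer may optionally take an input_shape argument??)
model = Sequential()
model.add(Dense(32, input_shape=(500,)))
# Afterwards, we do automatic shape inference:
# (My reading: after that, shapes get inferred automatically??)
model.add(Dense(32))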

I couldn't make sense of this at all. What is Dense? What is automatic shape inference? What is input_shape?
Let's tackle them in order.

1. What is Dense?

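https://tensorflow.google.cn/api_docs/python/tf/compat/v1/layers/Dense?hl=en
(Note: this page documents the legacy tf.compat.v1.layers.Dense. The Dense used with Sequential above is tf.keras.layers.Dense. It performs the same operation, but its arguments differ slightly; for example, it has no _reuse argument.)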
Class Dense
Densely-connected layer class. (That is, a fully connected layer.)

This layer implements the operation: outputs = activation(inputs * kernel + bias) Where activation is the activation function passed as the activation argument (if not None), kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only if use_bias is True).
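The multiplication in that formula is a matrix product, not an elementwise one. Here is a minimal sketch that checks this against tf.keras.layers.Dense (assuming the Keras Dense is the one actually being used with Sequential). The batch size of 4, input width of 3, 2 units, and the ReLU activation are arbitrary choices for illustration.

```python
import numpy as np
import tensorflow as tf

# Arbitrary illustrative sizes: a batch of 4 samples with 3 features each.
inputs = np.random.rand(4, 3).astype("float32")

layer = tf.keras.layers.Dense(2, activation="relu")
# Calling the layer for the first time creates its kernel and bias.
outputs = layer(tf.constant(inputs)).numpy()

kernel, bias = layer.get_weights()  # kernel: (3, 2), bias: (2,)

# outputs = activation(inputs * kernel + bias), with * read as a matrix product
manual = np.maximum(inputs @ kernel + bias, 0.0)

print(kernel.shape, bias.shape)
print(np.allclose(outputs, manual, atol=1e-5))
```

The kernel has shape (input width, units), so units alone determines the width of the output.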

Arguments:
units: Integer or Long, dimensionality of the output space.
activation: Activation function (callable). Set it to None to maintain a linear activation.
use_bias: Boolean, whether the layer uses a bias.
kernel_initializer: Initializer function for the weight matrix. If None (default), weights are initialized using the default initializer used by tf.compat.v1.get_variable.
bias_initializer: Initializer function for the bias.
kernel_regularizer: Regularizer function for the weight matrix.
bias_regularizer: Regularizer function for the bias.
activity_regularizer: Regularizer function for the output.
kernel_constraint: An optional projection function to be applied to the kernel after being updated by an Optimizer (e.g. used to implement norm constraints or value constraints for layer weights). The function must take as input the unprojected variable and must return the projected variable (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training.
bias_constraint: An optional projection function to be applied to the bias after being updated by an Optimizer.
trainable: Boolean, if True also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
_reuse: Boolean, whether to reuse the weights of a previous layer by the same name.

Properties:
units: Python integer, dimensionality of the output space.
activation: Activation function (callable).
use_bias: Boolean, whether the layer uses a bias.
kernel_initializer: Initializer instance (or name) for the kernel matrix.
bias_initializer: Initializer instance (or name) for the bias.
kernel_regularizer: Regularizer instance for the kernel matrix (callable)
bias_regularizer: Regularizer instance for the bias (callable).
activity_regularizer: Regularizer instance for the output (callable)
kernel_constraint: Constraint function for the kernel matrix.
bias_constraint: Constraint function for the bias.
kernel: Weight matrix (TensorFlow variable or tensor).
bias: Bias vector, if applicable (TensorFlow variable or tensor).

__init__(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    **kwargs
)

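2. What is automatic shape inference?

Automatic shape inference is a Keras feature that makes it easier to build models with Sequential. I don't know how it works internally,
and searching Baidu turned up nothing relevant.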

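3. What is input_shape?

shape is the shape of a tensor.
input_shape is the shape of the input tensor.
The automatic shape inference above presumably means that Keras computes the shape of each layer's output tensor and passes it on to the next layer,
so it does not have to be set by hand.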

shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; ‘None’ elements represent dimensions where the shape is not known.
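To see the inference at work, here is a small sketch. The layer sizes 32 and 10 and the batch size 7 are made up for illustration. Only the first layer is given an input shape; the kernel shape of the second layer shows what Keras worked out on its own.

```python
import tensorflow as tf

model = tf.keras.Sequential()
# Only the first layer is told the input shape (batch size excluded).
model.add(tf.keras.layers.Dense(32, input_shape=(500,)))
# The input width of this layer is inferred from the 32 units before it.
model.add(tf.keras.layers.Dense(10))

# Kernel shapes reveal what was inferred: (input width, units).
kernel_shapes = [tuple(layer.kernel.shape) for layer in model.layers]
print(kernel_shapes)

# The batch dimension is left open: any number of 500-dimensional vectors works.
out = model(tf.zeros((7, 500)))
print(tuple(out.shape))
```

Newer Keras versions may print a warning recommending an Input object instead of input_shape; the sketch keeps input_shape to match the official example quoted here.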

At this point I have a rough understanding of the three questions above, but working through them raised more questions:

What are Keras layers?

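A layer is one layer of a neural network, and Keras offers many kinds to choose from.
https://tensorflow.google.cn/api_docs/python/tf/keras/layers?hl=en
I think it's fine to look them up when I actually need them. There are too many to memorize otherwise.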

Back to the beginning of the documentation, the Sequential section:

Class Sequential
Linear stack of layers.
(Stacked up like the layers of a hamburger.)

Here is the example code from the official documentation:

# Optionally, the first layer can receive an `input_shape` argument:
model = Sequential()
model.add(Dense(32, input_shape=(500,)))
# Afterwards, we do automatic shape inference:
model.add(Dense(32))

# This is identical to the following:
model = Sequential()
model.add(Dense(32, input_dim=500))

# And to the following:
model = Sequential()
model.add(Dense(32, batch_input_shape=(None, 500)))

# Note that you can also omit the `input_shape` argument:
# (Vocabulary note: to "omit" means to leave out or skip.)
# In that case the model gets built the first time you call `fit` (or other
# training and evaluation methods).
# (That is, the model only gets built the first time you run fit or another training or evaluation method.)
model = Sequential()
model.add(Dense(32))
model.add(Dense(32))
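# Here the model stacks two fully connected layers.
# (optimizer, loss, x and y are placeholders that you define yourself.)
model.compile(optimizer=optimizer, loss=loss)
# Then model.compile?? What is that?
# This builds the model for the first time:
# What is model.fit?
model.fit(x, y, batch_size=32, epochs=10)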

# Note that when using this delayed-build pattern (no input shape specified),
# the model doesn't have any weights until the first call
# to a training/evaluation method (since it isn't yet built):
model = Sequential()
model.add(Dense(32))
model.add(Dense(32))
# What is model.weights?
model.weights  # returns []

# Whereas if you specify the input shape, the model gets built continuously
# (Vocabulary note: "whereas" means "but" or "in contrast".)
# as you are adding layers:
model = Sequential()
model.add(Dense(32, input_shape=(500,)))
model.add(Dense(32))
model.weights  # returns list of length 4

# When using the delayed-build pattern (no input shape specified), you can
# choose to manually build your model by calling `build(batch_input_shape)`:
# (With delayed build, you can build the model manually with model.build.)
model = Sequential()
model.add(Dense(32))
model.add(Dense(32))
model.build((None, 500))
model.weights  # returns list of length 4

After reading the code above, I found a few things I don't know yet. Noting them down:

model.compile
model.fit
model.weights
model.build

Continuing with Sequential for now; I'll dig into the details of those four items in a moment.

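Its constructor looks like this. I noticed it can be given a name (the name argument), and it seems you can pass a collection of layers straight into it, such as a list: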

__init__(
    layers=None,
    name=None
)
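A quick sketch of both constructor arguments in use. The name "my_model" and the layer sizes are made up for illustration.

```python
import tensorflow as tf

# Passing a list of layers is equivalent to calling model.add() on each in turn.
layers = [
    tf.keras.layers.Dense(32, input_shape=(500,)),
    tf.keras.layers.Dense(32),
]
model = tf.keras.Sequential(layers, name="my_model")

print(model.name)
print(len(model.layers))
```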

Properties:

layers
metrics_names
Returns the model’s display labels for all outputs.

run_eagerly
Settable attribute indicating whether the model should run eagerly.

Running eagerly means that your model will be run step by step, like Python code. Your model might run slower, but it should become easier for you to debug it by stepping into individual layer calls.

By default, we will attempt to compile your model to a static graph to deliver the best execution performance.

Returns:
Boolean, whether the model should run eagerly.

sample_weights
state_updates
Returns the updates from all layers that are stateful.

This is useful for separating training updates and state updates, e.g. when we need to update a layer’s internal state during prediction.

Returns:
A list of update ops.

stateful

Methods

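It has a lot of methods. Studying them one by one in order would surface a lot of rarely used information and slow me down.
The four main open questions above all concern members of the model (three methods plus the weights property), so I'll learn those four first to be more efficient.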

model.compile

Configures the model for training.

compile(
    optimizer='rmsprop',
    # What is rmsprop? It's an optimizer; details later.
    loss=None,
    metrics=None,
    loss_weights=None,
    sample_weight_mode=None,
    weighted_metrics=None,
    target_tensors=None,
    distribute=None,
    **kwargs
    # What is kwargs? Arbitrary keyword (key=value) arguments; details later.
)

Arguments:
optimizer: String (name of optimizer) or optimizer instance. See tf.keras.optimizers.

loss: String (name of objective function), objective function or tf.losses.Loss instance. See tf.losses. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses.

metrics: List of metrics to be evaluated by the model during training and testing. Typically you will use metrics=['accuracy']. To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={'output_a': 'accuracy', 'output_b': ['accuracy', 'mse']}. You can also pass a list (len = len(outputs)) of lists of metrics such as metrics=[['accuracy'], ['accuracy', 'mse']] or metrics=['accuracy', ['accuracy', 'mse']].

loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model’s outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.

sample_weight_mode: If you need to do timestep-wise sample weighting (2D weights), set this to "temporal". None defaults to sample-wise weights (1D). If the model has multiple outputs, you can use a different sample_weight_mode on each output by passing a dictionary or a list of modes.

weighted_metrics: List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.

target_tensors: By default, Keras will create placeholders for the model’s target, which will be fed with the target data during training. If instead you would like to use your own target tensors (in turn, Keras will not expect external Numpy data for these targets at training time), you can specify them via the target_tensors argument. It can be a single tensor (for a single-output model), a list of tensors, or a dict mapping output names to target tensors.

distribute: NOT SUPPORTED IN TF 2.0, please create and compile the model under distribution strategy scope instead of passing it to compile.

**kwargs: Any additional arguments.

Raises (the exceptions thrown):
ValueError: In case of invalid arguments for optimizer, loss, metrics or sample_weight_mode.
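Putting the main compile arguments together, here is a minimal sketch. The tiny model, the random data, the learning rate, and the choice of "mse" and "mae" are all illustrative assumptions, not something the documentation prescribes.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])

# optimizer can be a name such as "rmsprop" or an instance, as here;
# loss is given by name; metrics is a list.
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
    loss="mse",
    metrics=["mae"],
)

# compile() only configures the model; no weights change until fit().
x = np.random.rand(16, 4).astype("float32")
y = np.random.rand(16, 1).astype("float32")
results = model.evaluate(x, y, verbose=0)  # the loss followed by the metric
print(results)
```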

model.fit

Trains the model for a fixed number of epochs (iterations on a dataset).

fit(
    x=None,
    y=None,
    batch_size=None,
    epochs=1,
    verbose=1,
    callbacks=None,
    validation_split=0.0,
    validation_data=None,
    shuffle=True,
    class_weight=None,
    sample_weight=None,
    initial_epoch=0,
    steps_per_epoch=None,
    validation_steps=None,
    validation_freq=1,
    max_queue_size=10,
    workers=1,
    use_multiprocessing=False,
    **kwargs
)

Arguments:

x: Input data. It could be:
A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs).
A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs).
A dict mapping input names to the corresponding array/tensors, if the model has named inputs.
A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).
A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample weights).

y: Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, generator, or keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).

batch_size: Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of symbolic tensors, datasets, generators, or keras.utils.Sequence instances (since they generate batches).

epochs: Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided. Note that in conjunction with initial_epoch, epochs is to be understood as “final epoch”. The model is not trained for a number of iterations given by epochs, but merely until the epoch of index epochs is reached.

verbose: 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (eg, in a production environment).

callbacks: List of keras.callbacks.Callback instances. List of callbacks to apply during training. See tf.keras.callbacks.

validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. This argument is not supported when x is a dataset, generator or keras.utils.Sequence instance.

validation_data: Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split. validation_data could be:
tuple (x_val, y_val) of Numpy arrays or tensors
tuple (x_val, y_val, val_sample_weights) of Numpy arrays
dataset
For the first two cases, batch_size must be provided. For the last case, validation_steps must be provided.

shuffle: Boolean (whether to shuffle the training data before each epoch) or str (for ‘batch’). ‘batch’ is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None.

class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to “pay more attention” to samples from an under-represented class.

sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode="temporal" in compile(). This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance, instead provide the sample_weights as the third element of x.

initial_epoch: Integer. Epoch at which to start training (useful for resuming a previous training run).

steps_per_epoch: Integer or None. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If x is a tf.data dataset, and ‘steps_per_epoch’ is None, the epoch will run until the input dataset is exhausted. This argument is not supported with array inputs.

validation_steps: Only relevant if validation_data is provided and is a tf.data dataset. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch. If validation_data is a tf.data dataset and ‘validation_steps’ is None, validation will run until the validation_data dataset is exhausted.

validation_freq: Only relevant if validation data is provided. Integer or collections_abc.Container instance (e.g. list, tuple, etc.). If an integer, specifies how many training epochs to run before a new validation run is performed, e.g. validation_freq=2 runs validation every 2 epochs. If a Container, specifies the epochs on which to run validation, e.g. validation_freq=[1, 2, 10] runs validation at the end of the 1st, 2nd, and 10th epochs.

max_queue_size: Integer. Used for generator or keras.utils.Sequence input only. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10.

workers: Integer. Used for generator or keras.utils.Sequence input only. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread.

use_multiprocessing: Boolean. Used for generator or keras.utils.Sequence input only. If True, use process-based threading. If unspecified, use_multiprocessing will default to False. Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator as they can’t be passed easily to children processes.

**kwargs: Used for backwards compatibility.

Returns:
A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).

Raises:
RuntimeError: If the model was never compiled.
ValueError: In case of mismatch between the provided input data and what the model expects.
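A small sketch of fit and the History object it returns, also showing how epochs behaves together with initial_epoch. The synthetic data and the tiny model are assumptions made purely for illustration.

```python
import numpy as np
import tensorflow as tf

# Synthetic data, purely illustrative: 100 samples, 4 features, 1 target.
x = np.random.rand(100, 4).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="rmsprop", loss="mse")

# The last 20% of the samples are held out for validation.
history = model.fit(x, y, batch_size=32, epochs=3,
                    validation_split=0.2, verbose=0)
print(sorted(history.history.keys()))  # training and validation loss
print(len(history.history["loss"]))    # one entry per epoch

# With initial_epoch, epochs is the index of the final epoch:
# this call runs only epochs 3 and 4, not five more.
resumed = model.fit(x, y, batch_size=32, epochs=5,
                    initial_epoch=3, verbose=0)
print(resumed.epoch)
```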

model.weights

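I couldn't find a dedicated entry for it by searching. Judging from the example above, it is a property rather than a method: it holds the list of the model's weight variables. Each Dense layer has a kernel and a bias, which is why two Dense layers give a list of length 4.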
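For reference, a minimal sketch of what model.weights holds for the two-layer model from the official example; printing the shapes is my own addition.

```python
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(32, input_shape=(500,)))
model.add(tf.keras.layers.Dense(32))

# Each Dense layer contributes a kernel and a bias, hence 4 variables.
for w in model.weights:
    print(w.name, tuple(w.shape))

shapes = [tuple(w.shape) for w in model.weights]
```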

model.build

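I couldn't find a dedicated entry for it by searching. Judging from the example above, build(batch_input_shape) is a method that builds the model for a given input shape (batch dimension included), so that the weights exist before fit is ever called.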
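For reference, a minimal sketch of model.build on the delayed-build model from the official example. It checks model.built instead of reading model.weights before building, because as far as I know some TensorFlow 2.x releases raise an error when weights is read on an unbuilt Sequential model.

```python
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(32))
model.add(tf.keras.layers.Dense(32))

built_before = model.built  # no input shape is known yet

# The shape passed to build() includes the batch dimension (None = any size).
model.build((None, 500))

built_after = model.built
n_weights = len(model.weights)
print(built_before, built_after, n_weights)
```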

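Summary:

Learning is a continual exploration of details.
After the deeper study in this post, going back to self-study notes (1) would surely bring new insights. That is what review is for: revisiting the old to learn something new.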