TensorFlow 2.0 Tutorial - Custom Training in Practice (without tf.keras training methods)

In this tutorial we use TensorFlow to classify iris flowers. The whole process includes building the model, training it, and making predictions. The network layers still come from keras.layers, but the training process does not use the Keras training methods; instead it uses automatic differentiation in TensorFlow 2's eager mode.

Original article: https://doit-space.blog.csdn.net/article/details/95041068

The most complete TensorFlow 2.0 getting-started tutorial, continuously updated: https://blog.csdn.net/qq_31456593/article/details/88606284

See the complete TensorFlow 2.0 tutorial code at https://github.com/czy36mengfei/tensorflow2_tutorials_chinese (stars welcome)

This tutorial consists mainly of personal study notes compiled while reproducing the official TensorFlow 2.0 tutorials, explained in Chinese for readers who prefer Chinese-language tutorials. Official tutorials: https://www.tensorflow.org

Import the required libraries

Import TensorFlow and the other required Python modules. By default, TensorFlow 2 uses eager execution, so operations are evaluated immediately and return concrete values.

from __future__ import absolute_import, division, print_function, unicode_literals
import os
import matplotlib.pyplot as plt
import tensorflow as tf
print('tf version:', tf.__version__)
print('eager execution:', tf.executing_eagerly())
tf version: 2.0.0-alpha0
eager execution: True

Iris classification problem

Imagine you are a botanist looking for an automated way to classify each iris you find. Machine learning provides many algorithms for classifying flowers statistically. For example, a sophisticated machine learning program could classify flowers based on photographs. Here, we classify irises based on the length and width measurements of their sepals and petals.

The genus Iris has more than 300 species, but here we classify only the following three:

  • Iris setosa
  • Iris virginica
  • Iris versicolor

[Figure: the three iris species (https://www.tensorflow.org/images/iris_three_species.jpg)]

Fortunately, someone has already created a dataset of 120 irises with sepal and petal measurements. It is a classic dataset that is popular for beginner machine learning classification problems.

Download the dataset

Use the tf.keras.utils.get_file function to download the training dataset file. It returns the file path of the downloaded file.

train_dataset_url = "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv"
train_dataset_fp = tf.keras.utils.get_file(fname=os.path.basename(train_dataset_url),
                                          origin=train_dataset_url)
print('Downloaded data to:', train_dataset_fp)
Downloaded data to: /root/.keras/datasets/iris_training.csv

Check the data

The dataset, iris_training.csv, is a plain text file that stores tabular data formatted as comma-separated values (CSV). Use the head -n5 command to take a peek at the first five entries:

!head -n5 {train_dataset_fp}
120,4,setosa,versicolor,virginica
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
4.9,2.5,4.5,1.7,2
4.9,3.1,1.5,0.1,0

From this view of the dataset, note the following:

  • The first line is a header containing information about the dataset:
      • There are 120 examples in total. Each example has four features and one of three possible label names.
  • Subsequent rows are data records, one example per line, where:
      • The first four fields are features: the characteristics of an example. Here, the fields hold floating-point numbers representing flower measurements.
      • The last column is the label: the value we want to predict. For this dataset, it is an integer value of 0, 1, or 2 that corresponds to a flower name.
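
The header layout described above can be checked by hand. The following is a minimal sketch that uses only the Python standard library and the five lines shown in the head output; the helper name parse_iris_csv is hypothetical and is not part of the tutorial code.

```python
import csv
import io

# The first five lines of iris_training.csv, copied from the head output.
sample = """120,4,setosa,versicolor,virginica
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
4.9,2.5,4.5,1.7,2
4.9,3.1,1.5,0.1,0
"""

def parse_iris_csv(text):
    """Split the header (counts and class names) from the data rows."""
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]
    num_examples = int(header[0])   # 120 examples in the full file
    num_features = int(header[1])   # 4 features per example
    class_names = header[2:]        # names for labels 0, 1, 2
    features = [[float(v) for v in r[:num_features]] for r in records]
    labels = [int(r[num_features]) for r in records]
    return num_examples, num_features, class_names, features, labels

num_examples, num_features, names, features, labels = parse_iris_csv(sample)
print(num_examples, num_features, names)
print(features[0], "->", names[labels[0]])
```

The integer in the last column indexes into the class names from the header, which is why label 2 on the first data row corresponds to virginica.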

column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
# Get the feature names and the label name
feature_name = column_names[:-1]
label_name = column_names[-1]

Each label is associated with a string name (for example, "setosa"), but machine learning typically relies on numeric values. The label numbers map to named classes as follows:

  • 0: Iris setosa
  • 1: Iris versicolor
  • 2: Iris virginica
class_names = ['Iris setosa', 'Iris versicolor', 'Iris virginica']

Create a tf.data.Dataset

TensorFlow's Dataset API handles many common cases for loading data into a model. It is a high-level API for reading data and transforming it into a form used for training.

Since the dataset is a CSV-formatted text file, use the make_csv_dataset function to parse the data into a suitable format. Because this function generates data for training models, its default behavior is to shuffle the data (shuffle=True, shuffle_buffer_size=10000) and to repeat the dataset forever (num_epochs=None). The batch_size parameter must also be set. Here, num_epochs=1 is passed so that one iteration over the dataset corresponds to one epoch.

batch_size=32
train_dataset = tf.data.experimental.make_csv_dataset(
    train_dataset_fp,
    batch_size,
    column_names=column_names,
    label_name=label_name,
    num_epochs=1
)

The make_csv_dataset function returns a tf.data.Dataset of (features, label) pairs, where features is a dictionary: {'feature_name': value}

These Dataset objects are iterable.

features, labels = next(iter(train_dataset))
print(features)
OrderedDict([('sepal_length', <tf.Tensor: id=64, shape=(32,), dtype=float32, numpy=
array([7.6, 6.9, 7.2, 5. , 6.7, 4.8, 5.4, 5.1, 7.7, 6. , 6.3, 7.4, 5.2,
       7.2, 6.7, 6.1, 5. , 4.9, 6.2, 4.5, 6.6, 6. , 5.5, 6.3, 4.8, 6.7,
       6.1, 5.6, 7.3, 6.9, 5.7, 6.3], dtype=float32)>), ('sepal_width', <tf.Tensor: id=65, shape=(32,), dtype=float32, numpy=
array([3. , 3.2, 3.6, 2.3, 3. , 3. , 3.9, 3.7, 3. , 2.2, 2.3, 2.8, 2.7,
       3.2, 3.1, 2.8, 3.4, 3.1, 2.8, 2.3, 3. , 3. , 3.5, 3.3, 3.4, 3. ,
       2.8, 2.9, 2.9, 3.1, 3.8, 2.5], dtype=float32)>), ('petal_length', <tf.Tensor: id=62, shape=(32,), dtype=float32, numpy=
array([6.6, 5.7, 6.1, 3.3, 5.2, 1.4, 1.3, 1.5, 6.1, 5. , 4.4, 6.1, 3.9,
       6. , 5.6, 4. , 1.6, 1.5, 4.8, 1.3, 4.4, 4.8, 1.3, 6. , 1.6, 5. ,
       4.7, 3.6, 6.3, 4.9, 1.7, 5. ], dtype=float32)>), ('petal_width', <tf.Tensor: id=63, shape=(32,), dtype=float32, numpy=
array([2.1, 2.3, 2.5, 1. , 2.3, 0.3, 0.4, 0.4, 2.3, 1.5, 1.3, 1.9, 1.4,
       1.8, 2.4, 1.3, 0.4, 0.1, 1.8, 0.3, 1.4, 1.8, 0.2, 2.5, 0.2, 1.7,
       1.2, 1.3, 1.8, 1.5, 0.3, 1.9], dtype=float32)>)])

Like features are grouped together: the values of each feature for the whole batch sit in one array of length batch_size.
A batch can be visualized as follows:

plt.scatter(features['petal_length'],
            features['sepal_length'],
            c=labels,
            cmap='viridis')

plt.xlabel("Petal length")
plt.ylabel("Sepal length")
plt.show()

[Figure: scatter plot of petal length against sepal length, colored by label (output_20_0.png)]

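To simplify model building, we repackage the features dictionary into a single array of shape (batch_size, num_features). The function below uses tf.stack, which takes the values from a list of tensors and combines them into one tensor along the specified axis.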

def pack_features_vector(features, labels):
    features = tf.stack(list(features.values()), axis=1)
    return features, labels
# Use tf.data.Dataset.map to apply the packing function to every (features, label) pair.
train_dataset = train_dataset.map(pack_features_vector)
# Look at the first 5 examples
features, labels = next(iter(train_dataset))
print(features[:5])
tf.Tensor(
[[7.6 3.  6.6 2.1]
 [6.9 3.2 5.7 2.3]
 [7.2 3.6 6.1 2.5]
 [5.  2.3 3.3 1. ]
 [6.7 3.  5.2 2.3]], shape=(5, 4), dtype=float32)
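
To make the reshaping concrete, here is a framework-free sketch of what stacking along axis 1 does to a dictionary of columns. It uses plain Python lists instead of tensors, and the helper name pack_columns is hypothetical; the values are the first three entries of each column from the batch printed earlier.

```python
from collections import OrderedDict

def pack_columns(feature_columns):
    """Turn {name: [v0, v1, ...]} columns into rows of shape (batch_size, num_features)."""
    # zip(*columns) pairs up the i-th value of every column, giving one row per example.
    return [list(row) for row in zip(*feature_columns.values())]

batch = OrderedDict([
    ("sepal_length", [7.6, 6.9, 7.2]),
    ("sepal_width", [3.0, 3.2, 3.6]),
    ("petal_length", [6.6, 5.7, 6.1]),
    ("petal_width", [2.1, 2.3, 2.5]),
])

rows = pack_columns(batch)
for row in rows:
    print(row)
```

Each row now holds the four measurements of one flower, in the same column order as the dictionary, which matches the first rows of the tensor printed above.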

Select Model

A model is a relationship between features and the label. For the iris classification problem, the model defines the relationship between the sepal and petal measurements and the predicted iris species. Some simple models can be described with a few lines of algebra, but complex machine learning models have a large number of parameters that are difficult to summarize.

Could you determine the relationship between the four features and the iris species without using machine learning? That is, could you use traditional programming techniques (for example, a lot of conditional statements) to create a model? Perhaps, if you analyzed the dataset long enough to determine the relationships between petal and sepal measurements and a particular species. This becomes difficult, maybe impossible, on more complicated datasets. A good machine learning approach determines the model for you. If you feed enough representative examples into the right type of machine learning model, the program will figure out the relationships for you.

Select a specific model

We need to select the kind of model to train. There are many types of models, and picking a good one takes experience. This tutorial uses a neural network to solve the iris classification problem. Neural networks can find complex relationships between features and the label. A neural network is a highly structured graph, organized into one or more hidden layers. Each hidden layer consists of one or more neurons. There are several categories of neural networks, and this program uses a dense, or fully connected, neural network: the neurons in one layer receive input connections from every neuron in the previous layer. For example, Figure 2 illustrates a dense neural network consisting of an input layer, two hidden layers, and an output layer:

[Figure 2 (image unavailable)]
Figure 2. A neural network with features, hidden layers, and predictions.

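When the model from Figure 2 is trained and fed an unlabeled example, it yields three predictions: the likelihood that the flower belongs to each iris species. This prediction is called inference. For this example, the output predictions sum to 1.0. In Figure 2, the prediction breaks down as 0.02 for Iris setosa, 0.95 for Iris versicolor, and 0.03 for Iris virginica. This means that the model predicts, with 95% probability, that the unlabeled example flower is an Iris versicolor.

Create a model using Keras

The TensorFlow tf.keras API is the preferred way to create models and layers. It makes it easy to build models and experiment, while Keras handles the complexity of connecting everything together.

The tf.keras.Sequential model is a linear stack of layers. Its constructor takes a list of layer instances, in this case two Dense layers with 10 nodes each and an output layer with 3 nodes representing our label predictions. The input_shape parameter of the first layer corresponds to the number of features in the dataset and is required.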

# Build the model as a linear stack of layers
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu',input_shape=(4,)),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(3)
])

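The activation function determines the output of each node in the layer. These non-linearities are important: without them, the model would be equivalent to a single layer. There are many available activations, but ReLU is common for hidden layers.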
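
The claim that stacked layers without a non-linearity collapse into a single layer can be verified with a small amount of arithmetic. The sketch below is a plain-Python illustration with made-up 2x2 weight matrices and no biases (both are assumptions for the sake of the example, not values from the tutorial's model).

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def relu(v):
    """Element-wise ReLU: negative values become zero."""
    return [max(0.0, t) for t in v]

# Hypothetical weights and input, chosen so the arithmetic is exact.
W1 = [[1.0, -2.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0], [0.5, -1.0]]
x = [1.0, 2.0]

stacked_linear = matvec(W2, matvec(W1, x))   # two layers, no activation
single_layer = matvec(matmul(W2, W1), x)     # one layer with weights W2 @ W1
with_relu = matvec(W2, relu(matvec(W1, x)))  # two layers with ReLU in between

print(stacked_linear, single_layer, with_relu)
```

The first two results are identical because matrix multiplication is associative, so the two linear layers add no expressive power. Inserting ReLU between them changes the result, which is what lets deeper networks represent functions that a single layer cannot.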

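The ideal number of hidden layers and neurons depends on the problem and the dataset. Like many aspects of machine learning, picking the best shape of the neural network requires a mixture of knowledge and experimentation. As a rule of thumb, increasing the number of hidden layers and neurons typically creates a more powerful model, which requires more data to train effectively.

Test the model structure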

prediction = model(features)
prediction[:5]
<tf.Tensor: id=229, shape=(5, 3), dtype=float32, numpy=
array([[1.6543204 , 0.12405288, 0.24490094],
       [1.4488522 , 0.11291474, 0.24872684],
       [1.5161525 , 0.11867774, 0.28899187],
       [0.86002606, 0.05858952, 0.06260413],
       [1.3767202 , 0.10884094, 0.21688706]], dtype=float32)>

For multi-class classification, apply softmax to normalize the logits into probabilities

tf.nn.softmax(prediction)[:5]
<tf.Tensor: id=235, shape=(5, 3), dtype=float32, numpy=
array([[0.6845738 , 0.148195  , 0.16723117],
       [0.63935834, 0.16809471, 0.19254689],
       [0.6492055 , 0.16049689, 0.1902975 ],
       [0.52654505, 0.23625231, 0.23720267],
       [0.62697244, 0.1764475 , 0.19658   ]], dtype=float32)>
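
As a cross-check, softmax can be computed by hand for the first row of logits printed above. This is a minimal standard-library sketch, not the TensorFlow implementation; subtracting the maximum logit before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# First row of the untrained model's output shown earlier.
logits = [1.6543204, 0.12405288, 0.24490094]
probs = softmax(logits)

# The predicted class is the index of the largest probability (argmax).
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

The probabilities agree with the first row of the tf.nn.softmax output to about four decimal places, and the largest one belongs to class 0.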

Use tf.argmax to get the class label with the highest probability

print('prediction:', tf.argmax(prediction, axis=1))
print('label:', labels)
prediction: tf.Tensor([0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0], shape=(32,), dtype=int64)
label: tf.Tensor([2 2 2 1 2 0 0 0 2 2 1 2 1 2 2 1 0 0 2 0 1 2 0 2 0 1 1 1 2 1 0 2], shape=(32,), dtype=int32)

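Train the model

Training is the stage of machine learning in which the model learns from the dataset and gradually optimizes itself.

The iris classification problem is a typical supervised learning problem: the model learns from a dataset that contains labels. In unsupervised learning, by contrast, the model looks for patterns using only the features.

Both training and evaluation need to calculate the model's loss, which measures how far the predictions are from the correct labels. Training aims to minimize this loss.

Below, we compute the loss directly with a ready-made loss function from tf.keras.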

# Loss function
loss_object=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Compute the loss for a batch
def loss(model, x, y):
    y_ = model(x)
    return loss_object(y_true=y, y_pred=y_)
l = loss(model, features, labels)
print(l)
tf.Tensor(1.3738844, shape=(), dtype=float32)
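
For intuition about what this loss measures, here is a hedged plain-Python sketch of sparse categorical cross-entropy computed from logits: the negative log of the softmax probability assigned to the true class, averaged over the batch. The function names are hypothetical, and the averaging is assumed to mirror the default reduction of the Keras loss object.

```python
import math

def sparse_cross_entropy_from_logits(logits, label):
    """Return -log(softmax(logits)[label]) using the log-sum-exp trick."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]

def batch_loss(batch_logits, labels):
    """Average the per-example losses over the batch."""
    losses = [sparse_cross_entropy_from_logits(z, y)
              for z, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

# A model with no preference among 3 classes pays log(3) per example.
uniform = sparse_cross_entropy_from_logits([0.0, 0.0, 0.0], label=1)
# A confident and correct prediction pays almost nothing.
confident = sparse_cross_entropy_from_logits([10.0, 0.0, 0.0], label=0)
# A confident but wrong prediction is penalized heavily.
wrong = sparse_cross_entropy_from_logits([10.0, 0.0, 0.0], label=2)

print(uniform, confident, wrong)
print(batch_loss([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]], [1, 0]))
```

An untrained three-class model typically starts with a loss near log(3), about 1.1, which is in the same range as the initial value printed above.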

Use tf.GradientTape to compute the gradients of the loss with respect to all trainable variables.

def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value = loss(model, inputs, targets)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)
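
tf.GradientTape records the operations run inside its context and differentiates them automatically. To show what such a gradient looks like, the sketch below works out one small piece by hand: the gradient of the cross-entropy loss with respect to the logits (not the weights), which is softmax(logits) minus the one-hot label. It then confirms the formula with finite differences. All function names here are hypothetical helpers, and the logits are made-up values.

```python
import math

def softmax(logits):
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])

def analytic_grad(logits, label):
    """d(loss)/d(logits) = softmax(logits) - one_hot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]

def numeric_grad(logits, label, eps=1e-6):
    """Central finite differences, one logit at a time."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        grads.append((cross_entropy(up, label) - cross_entropy(down, label)) / (2 * eps))
    return grads

logits, label = [1.5, 0.2, -0.3], 2
exact = analytic_grad(logits, label)
approx = numeric_grad(logits, label)
print(exact)
print(approx)
```

Automatic differentiation applies the chain rule through every layer to obtain the same kind of exact gradient for each weight and bias, without the cost or rounding error of finite differences.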

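Create an optimizer

An optimizer applies the computed gradients to the model's variables to minimize the loss function. You can think of the loss function as a curved surface (see Figure 3) whose lowest point we want to find by walking around. The gradients point in the direction of steepest ascent, so we travel the opposite way and move down the hill. By iteratively calculating the loss and gradient for each batch, we adjust the model during training. Gradually, the model finds the best combination of weights and biases to minimize the loss. The lower the loss, the better the model's predictions.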

[Figure 3 (image unavailable): https://cs231n.github.io/assets/nn3/opt1.gif]
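
The idea of walking downhill against the gradient can be sketched on a one-dimensional toy problem. The quadratic loss, starting point, and learning rate below are assumptions chosen for illustration; they are unrelated to the iris model.

```python
def loss_fn(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad_fn(w):
    """Derivative of the toy loss."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate, steps):
    """Repeatedly step against the gradient and record the loss."""
    history = [loss_fn(w)]
    for _ in range(steps):
        w -= learning_rate * grad_fn(w)  # move opposite to the steepest ascent
        history.append(loss_fn(w))
    return w, history

w_final, history = gradient_descent(w=0.0, learning_rate=0.1, steps=50)
print(w_final, history[0], history[-1])
```

With this learning rate each step shrinks the distance to the minimum by a constant factor, so the loss falls at every iteration. A learning rate that is too large would overshoot instead, which is why it is treated as a hyperparameter to tune.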

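TensorFlow has many optimization algorithms available for training. This model uses tf.keras.optimizers.Adam, which implements the Adam algorithm, an adaptive variant of stochastic gradient descent (SGD). The learning_rate sets the step size for each iteration down the hill. It is a hyperparameter that you commonly adjust to achieve better results.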

optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
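
For readers curious about how Adam differs from plain gradient descent, here is a sketch of the textbook Adam update for a single scalar parameter. It is not the Keras source code, and the Keras implementation may differ in details such as where epsilon is applied. The default values for beta_1, beta_2, and epsilon are assumed to match the Keras defaults, and the toy loss is an assumption for illustration.

```python
import math

def adam_step(param, grad, m, v, t, learning_rate=0.01,
              beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One textbook Adam update; t is the 1-based step count."""
    m = beta_1 * m + (1.0 - beta_1) * grad         # running mean of gradients
    v = beta_2 * v + (1.0 - beta_2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1.0 - beta_1 ** t)                # bias correction for early steps
    v_hat = v / (1.0 - beta_2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v

# The first step has a size close to learning_rate, whatever the gradient scale.
small_step, _, _ = adam_step(0.0, grad=-0.5, m=0.0, v=0.0, t=1)
large_step, _, _ = adam_step(0.0, grad=-500.0, m=0.0, v=0.0, t=1)

# Toy problem: minimize (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2.0 * (w - 3.0), m, v, t)

print(small_step, large_step, w)
```

Because the update divides by the root of the running squared gradient, the step size depends far less on the raw gradient magnitude than in plain SGD, which is one reason Adam often works reasonably well without much tuning.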

The optimizer is used as follows

loss_value, grads = grad(model, features, labels)
print('Step: {}, initial loss: {}'.format(optimizer.iterations.numpy(),
                                         loss_value.numpy()))
optimizer.apply_gradients(zip(grads, model.trainable_variables))
print('Step: {}, loss: {}'.format(optimizer.iterations.numpy(),
                                 loss(model, features, labels).numpy()))
Step: 0, initial loss: 1.3738844394683838
Step: 1, loss: 1.1648454666137695

Training loop

Each epoch makes one pass over the training data.

# Keep the per-epoch loss and accuracy for plotting
train_loss_results = []
train_accuracy_results = []

num_epochs = 201
for epoch in range(num_epochs):
    # Metric objects that track loss and accuracy for this epoch
    epoch_loss_avg = tf.keras.metrics.Mean()
    epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

    # Training loop over batches
    for x, y in train_dataset:
        # Compute the loss and the gradients
        loss_value, grads = grad(model, x, y)
        # Apply the gradients to update the variables
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Update the running mean of the loss
        epoch_loss_avg(loss_value)
        # Update the accuracy by comparing the predictions with the labels
        epoch_accuracy(y, model(x))

    # Save this epoch's loss and accuracy
    train_loss_results.append(epoch_loss_avg.result())
    train_accuracy_results.append(epoch_accuracy.result())

    if epoch % 50 == 0:
        print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(epoch,
                                                                    epoch_loss_avg.result(),
                                                                    epoch_accuracy.result()))

Epoch 000: Loss: 1.048, Accuracy: 70.000%
Epoch 050: Loss: 0.074, Accuracy: 99.167%
Epoch 100: Loss: 0.059, Accuracy: 99.167%
Epoch 150: Loss: 0.054, Accuracy: 99.167%
Epoch 200: Loss: 0.051, Accuracy: 99.167%
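
The two metric objects in the loop are accumulators that are fed one batch at a time and queried at the end of the epoch. The sketch below mimics that behavior in plain Python under stated assumptions: the class names RunningMean and RunningAccuracy are hypothetical stand-ins, and the batch losses and logits are made-up values.

```python
class RunningMean:
    """Averages the values it is given, like a mean-of-batch-losses tracker."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

class RunningAccuracy:
    """Counts how often the argmax of the logits equals the integer label."""
    def __init__(self):
        self.correct = 0
        self.seen = 0

    def update(self, labels, batch_logits):
        for y, logits in zip(labels, batch_logits):
            predicted = max(range(len(logits)), key=logits.__getitem__)
            self.correct += int(predicted == y)
            self.seen += 1

    def result(self):
        return self.correct / self.seen if self.seen else 0.0

loss_avg = RunningMean()
accuracy = RunningAccuracy()

# Two hypothetical batches: (batch loss, labels, logits per example).
batches = [
    (0.9, [0, 1], [[2.0, 0.1, 0.1], [0.2, 0.1, 1.5]]),
    (0.5, [2], [[0.1, 0.3, 2.2]]),
]
for batch_loss_value, labels, logits in batches:
    loss_avg.update(batch_loss_value)
    accuracy.update(labels, logits)

print(loss_avg.result(), accuracy.result())
```

Creating fresh accumulators at the start of every epoch, as the training loop does, keeps each epoch's statistics separate. With 120 training examples, the 99.167% accuracy reported above corresponds to 119 correct predictions.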

Visualize the training process

fig, axes = plt.subplots(2, sharex=True, figsize=(12, 8))
fig.suptitle('Training Metrics')

axes[0].set_ylabel("Loss", fontsize=14)
axes[0].plot(train_loss_results)

axes[1].set_ylabel("Accuracy", fontsize=14)
axes[1].set_xlabel("Epoch", fontsize=14)
axes[1].plot(train_accuracy_results)
plt.show()

[Figure: training loss and accuracy per epoch (output_46_0.png)]

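Evaluate the model

Evaluating the model is similar to training it. The biggest difference is that the examples come from a separate test set rather than the training set. To fairly assess a model's effectiveness, the examples used to evaluate it must differ from those used to train it.

The test dataset is set up much like the training dataset. Download the CSV text file and parse its values; unlike the training set, it is read without shuffling (shuffle=False):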

test_url = "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv"

test_fp = tf.keras.utils.get_file(fname=os.path.basename(test_url),
                                  origin=test_url)
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv
8192/573 [==============================] - 0s 0us/step

test_dataset = tf.data.experimental.make_csv_dataset(
    test_fp,
    batch_size,
    column_names=column_names,
    label_name='species',
    num_epochs=1,
    shuffle=False)

test_dataset = test_dataset.map(pack_features_vector)

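Evaluate the model on the test dataset

Unlike the training stage, the model evaluates only a single epoch of the test data. In the following code cell, we iterate over each example in the test set and compare the model's prediction against the actual label. This measures the model's accuracy across the entire test set.

# Metric object that tracks accuracy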
test_accuracy = tf.keras.metrics.Accuracy()

for (x,y) in test_dataset:
    logits = model(x)
    prediction = tf.argmax(logits, axis=1, output_type=tf.int32)
    test_accuracy(prediction, y) 

print('Test set accuracy:', test_accuracy.result())
Test set accuracy: tf.Tensor(0.96666664, shape=(), dtype=float32)

Compare the labels (first column) with the predictions (second column)

tf.stack([y, prediction], axis=1)
<tf.Tensor: id=164737, shape=(30, 2), dtype=int32, numpy=
array([[1, 1],
       [2, 2],
       [0, 0],
       [1, 1],
       [1, 1],
       [1, 1],
       [0, 0],
       [2, 1],
       [1, 1],
       [2, 2],
       [2, 2],
       [0, 0],
       [2, 2],
       [1, 1],
       [1, 1],
       [0, 0],
       [1, 1],
       [0, 0],
       [0, 0],
       [2, 2],
       [0, 0],
       [1, 1],
       [2, 2],
       [1, 1],
       [1, 1],
       [1, 1],
       [0, 0],
       [1, 1],
       [2, 2],
       [1, 1]], dtype=int32)>
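
The reported test accuracy can be reproduced from this table alone. The sketch below copies the 30 (label, prediction) pairs from the output above into a plain Python list and counts the matches.

```python
# (label, prediction) pairs copied from the stacked tensor above.
pairs = [
    (1, 1), (2, 2), (0, 0), (1, 1), (1, 1), (1, 1), (0, 0), (2, 1), (1, 1), (2, 2),
    (2, 2), (0, 0), (2, 2), (1, 1), (1, 1), (0, 0), (1, 1), (0, 0), (0, 0), (2, 2),
    (0, 0), (1, 1), (2, 2), (1, 1), (1, 1), (1, 1), (0, 0), (1, 1), (2, 2), (1, 1),
]

correct = sum(1 for label, predicted in pairs if label == predicted)
accuracy = correct / len(pairs)

# Collect the row index and values of every misclassified example.
mistakes = [(i, label, predicted)
            for i, (label, predicted) in enumerate(pairs)
            if label != predicted]

print(correct, len(pairs), accuracy)
print(mistakes)
```

Only one of the 30 test examples is misclassified (a virginica predicted as versicolor), so the accuracy is 29/30, which agrees with the value printed by the metric.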

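Use the trained model to make predictions

We have trained a model and "proven" that it is good, but not perfect, at classifying iris species. Now let's use the trained model to make some predictions on unlabeled examples, that is, examples that contain features but no label.

In real life, the unlabeled examples could come from many different sources, including apps, CSV files, and data feeds. For now, we manually provide three unlabeled examples to predict their labels. Recall that the label numbers are mapped to named representations as follows: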

  • 0: Iris setosa
  • 1: Iris versicolor
  • 2: Iris virginica
predict_dataset = tf.convert_to_tensor([
    [5.1, 3.3, 1.7, 0.5,],
    [5.9, 3.0, 4.2, 1.5,],
    [6.9, 3.1, 5.4, 2.1]
])

predictions = model(predict_dataset)

for i, logits in enumerate(predictions):
  class_idx = tf.argmax(logits).numpy()
  p = tf.nn.softmax(logits)[class_idx]
  name = class_names[class_idx]
  print("Example {} prediction: {} ({:4.1f}%)".format(i, name, 100*p))
Example 0 prediction: Iris setosa (99.9%)
Example 1 prediction: Iris versicolor (99.9%)
Example 2 prediction: Iris virginica (99.1%)