Getting Started with tf.keras (Part 3)

Building a Simple Network, Training and Testing the Model

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(784,), name='img')
h1 = layers.Dense(32, activation='relu')(inputs)
h2 = layers.Dense(32, activation='relu')(h1)
outputs = layers.Dense(10, activation='softmax')(h2)
model = tf.keras.Model(inputs=inputs, outputs=outputs, name='try')
# MNIST handwritten-digit dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss=tf.keras.losses.sparse_categorical_crossentropy,
              metrics=['accuracy'])
history = model.fit(x_train, y_train, batch_size=64, epochs=5, validation_split=0.2)
test_scores = model.evaluate(x_test, y_test, verbose=0)

Here, verbose defaults to 1, which displays progress; verbose=0 silences the output.
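Since the model was compiled with metrics=['accuracy'], evaluate returns the loss followed by the accuracy, so the scores can be read out like this:

print('Test loss:', test_scores[0])
print('Test accuracy:', test_scores[1])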

Model Serialization and Deserialization

model.save('model_save.h5')
del model
model = tf.keras.models.load_model('model_save.h5')
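
As a quick sanity check, the reloaded model should reproduce the original test scores, since save() stores the architecture, weights, and compile configuration together:

# the reloaded model should match the scores computed before saving
print(model.evaluate(x_test, y_test, verbose=0))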

Creating Multiple Models from a Shared Network

Before covering this part, let's first go over the relevant API functions.
tf.keras.layers.Conv2D() performs a 2D convolution. The details of the convolution operation are basic material and are not repeated here. When a convolutional layer is used as the first layer of a model, the key argument input_shape must be specified.
The parameters of tf.keras.layers.Conv2D() are as follows:

  • filters: the number of convolution kernels, i.e. the number of channels in the output space.
  • kernel_size: the size of the convolution kernel, usually a tuple or list. In 2D, (3, 4) specifies the window's size along the two spatial dimensions; a single integer such as 2 means the same value in every dimension, i.e. 2x2.
  • strides: the stride of the convolution, matching the kernel's dimensionality; a tuple or a single integer. Defaults to (1, 1).
  • padding: the padding type; defaults to 'valid'.
  • data_format: a string, channels_last (the default) or channels_first, giving the ordering of the input dimensions. channels_last corresponds to inputs of shape (batch, height, width, channels), while channels_first corresponds to (batch, channels, height, width). The default is taken from the image data format in the Keras config file at ~/.keras/keras.json; if it has never been set there, it is "channels_last".
  • dilation_rate: the dilation rate, used for dilated convolutions (as in IDCNN); a tuple of integers or a single integer. Note that dilation_rate != 1 and strides != 1 are two different things and must not be confused; Keras does not allow both at once. Defaults to (1, 1).
  • activation: the activation function.
  • use_bias: whether to add a bias term; defaults to True.
  • kernel_initializer: the initializer for the kernel weights; defaults to "glorot_uniform".
  • bias_initializer: the initializer for the bias; defaults to zeros.
  • kernel_regularizer: the regularizer function for the kernel weights.
  • bias_regularizer: the regularizer function for the bias.
  • activity_regularizer: the regularizer function applied to the layer's output.
  • kernel_constraint: the constraint function for the kernel weight matrix.
  • bias_constraint: the constraint function for the bias.

For input_shape and output_shape, the official docs describe them as follows (a quick shape check appears after the list):

  • input_shape: 4D tensor with shape (samples, channels, rows, cols) if data_format='channels_first', or 4D tensor with shape (samples, rows, cols, channels) if data_format='channels_last'.
  • output_shape: 4D tensor with shape (samples, filters, new_rows, new_cols) if data_format='channels_first', or 4D tensor with shape (samples, new_rows, new_cols, filters) if data_format='channels_last'. rows and cols values might have changed due to padding.
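
A minimal shape check of the above (the sizes here are illustrative, not from the original article): with 'valid' padding and stride 1, a 3x3 kernel shrinks each spatial dimension by 2.

x = tf.random.normal((1, 28, 28, 3))             # (samples, rows, cols, channels)
y = layers.Conv2D(filters=16, kernel_size=3)(x)  # padding='valid', strides=(1, 1)
print(y.shape)                                   # (1, 26, 26, 16)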

It is worth explaining the difference between tf.keras.layers.MaxPool2D and tf.keras.layers.GlobalMaxPool2D here.
MaxPool2D's __init__ parameters are:

__init__(
    pool_size=(2, 2),
    strides=None,
    padding='valid',
    data_format=None,
    **kwargs
)

GlobalMaxPool2D's __init__ parameters are:

__init__(
    data_format=None,
    **kwargs
)

MaxPool2D is ordinary pooling: it does not change the rank of the input tensor (4D in, 4D out). GlobalMaxPool2D is global pooling: it collapses the spatial dimensions entirely, and so does change the rank. As an example, with data_format='channels_last', assume each input is a 4D tensor (batch_size, rows, cols, channels), and compare the two outputs (a runnable check follows the list).
MaxPool2D produces:

  • 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).

while GlobalMaxPool2D produces:

  • 2D tensor with shape (batch_size, channels)
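
The difference is easy to verify directly (illustrative sizes):

x = tf.random.normal((2, 9, 9, 8))
print(layers.MaxPool2D(3)(x).shape)       # (2, 3, 3, 8) -- still a 4D tensor
print(layers.GlobalMaxPool2D()(x).shape)  # (2, 8)       -- spatial dims collapsed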

layers.Conv2DTranspose() is the transposed convolution ("deconvolution") layer, the reverse of a normal convolution. layers.UpSampling2D() performs upsampling: the rows and columns are repeated according to their respective upsampling factors. UpSampling2D can be viewed as the inverse of pooling; it enlarges the feature map by nearest-neighbor interpolation.
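
A small shape demo of the two (illustrative sizes): UpSampling2D simply repeats rows and columns, while Conv2DTranspose grows the feature map through a learned kernel.

x = tf.random.normal((1, 4, 4, 16))
print(layers.UpSampling2D(3)(x).shape)        # (1, 12, 12, 16) -- nearest-neighbor repeat
print(layers.Conv2DTranspose(8, 3)(x).shape)  # (1, 6, 6, 8)    -- (4-1)*1 + 3 = 6
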
That covers the relevant APIs; next, let's use shared layers to build multiple models.

# Encoder and decoder
encode_input = keras.Input(shape=(28,28,1), name='img')
h1 = layers.Conv2D(16, 3, activation='relu')(encode_input)
h1 = layers.Conv2D(32, 3, activation='relu')(h1)
h1 = layers.MaxPool2D(3)(h1)
h1 = layers.Conv2D(32, 3, activation='relu')(h1)
h1 = layers.Conv2D(16, 3, activation='relu')(h1)
encode_output = layers.GlobalMaxPool2D()(h1)
encode_model = keras.Model(inputs=encode_input, outputs=encode_output, name='encoder')
encode_model.summary()

h2 = layers.Reshape((4, 4, 1))(encode_output)
h2 = layers.Conv2DTranspose(16, 3, activation='relu')(h2)
h2 = layers.Conv2DTranspose(32, 3, activation='relu')(h2)
h2 = layers.UpSampling2D(3)(h2)
h2 = layers.Conv2DTranspose(16, 3, activation='relu')(h2)
decode_output = layers.Conv2DTranspose(1, 3, activation='relu')(h2)
autoencoder = keras.Model(inputs=encode_input, outputs=decode_output, name='autoencoder')
autoencoder.summary()

A model can also be used as if it were a layer:

encode_input = keras.Input(shape=(28,28,1), name='src_img')
h1 = layers.Conv2D(16, 3, activation='relu')(encode_input)
h1 = layers.Conv2D(32, 3, activation='relu')(h1)
h1 = layers.MaxPool2D(3)(h1)
h1 = layers.Conv2D(32, 3, activation='relu')(h1)
h1 = layers.Conv2D(16, 3, activation='relu')(h1)
encode_output = layers.GlobalMaxPool2D()(h1)

encode_model = keras.Model(inputs=encode_input, outputs=encode_output, name='encoder')
encode_model.summary()

decode_input = keras.Input(shape=(16,), name='encoded_img')
h2 = layers.Reshape((4, 4, 1))(decode_input)
h2 = layers.Conv2DTranspose(16, 3, activation='relu')(h2)
h2 = layers.Conv2DTranspose(32, 3, activation='relu')(h2)
h2 = layers.UpSampling2D(3)(h2)
h2 = layers.Conv2DTranspose(16, 3, activation='relu')(h2)
decode_output = layers.Conv2DTranspose(1, 3, activation='relu')(h2)
decode_model = keras.Model(inputs=decode_input, outputs=decode_output, name='decoder')
decode_model.summary()

autoencoder_input = keras.Input(shape=(28,28,1), name='img')
h3 = encode_model(autoencoder_input)
autoencoder_output = decode_model(h3)
autoencoder = keras.Model(inputs=autoencoder_input, outputs=autoencoder_output,
                          name='autoencoder')
autoencoder.summary()

Building a Multi-Input, Multi-Output Network

Before building the model, let's introduce the relevant APIs.
tf.keras.layers.Embedding() is mainly used to train word embeddings; its parameters are:

  • input_dim: the size of the vocabulary.
  • output_dim: the dimensionality of the output word vectors.
  • input_length: the length of the input sequences.

Given a 2D input tensor (batch_size, input_length), the output is a 3D tensor (batch_size, input_length, output_dim).
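
For example (illustrative sizes):

ids = tf.random.uniform((32, 10), maxval=1000, dtype=tf.int32)  # (batch_size, input_length)
emb = layers.Embedding(input_dim=1000, output_dim=64)
print(emb(ids).shape)                                           # (32, 10, 64)
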
tf.keras.layers.LSTM is the long short-term memory network, a recurrent cell used in RNNs; its main parameters are:

  • units: the dimensionality of the output space.
  • activation: the activation function for the output, tanh by default.
  • recurrent_activation: the activation function for the recurrent step (the input and forget gates), sigmoid by default.
  • return_sequences: whether to return the output of every cell, or only the final output (see the demo below).
  • return_state: whether to additionally return the last cell's state.
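
The effect of the two return flags, on illustrative data:

x = tf.random.normal((32, 10, 8))
print(layers.LSTM(4)(x).shape)                         # (32, 4): final output only
print(layers.LSTM(4, return_sequences=True)(x).shape)  # (32, 10, 4): every time step
out, h, c = layers.LSTM(4, return_state=True)(x)       # plus final hidden and cell state
print(out.shape, h.shape, c.shape)                     # (32, 4) (32, 4) (32, 4)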

# Build a network that predicts a document's priority and the department
# that should handle it, from the document body, tags, and title
# Hyperparameters
num_words = 2000
num_tags = 12
num_departments = 4

# Inputs
body_input = keras.Input(shape=(None,), name='body')
title_input = keras.Input(shape=(None,), name='title')
tag_input = keras.Input(shape=(num_tags,), name='tag')

# Embedding layers
body_feat = layers.Embedding(num_words, 64)(body_input)
title_feat = layers.Embedding(num_words, 64)(title_input)

# Feature extraction layers
body_feat = layers.LSTM(32)(body_feat)
title_feat = layers.LSTM(128)(title_feat)
features = layers.concatenate([title_feat, body_feat, tag_input])

# Classification layers
priority_pred = layers.Dense(1, activation='sigmoid', name='priority')(features)
department_pred = layers.Dense(num_departments, activation='softmax', name='department')(features)

# Build the model
model = keras.Model(inputs=[body_input, title_input, tag_input],
                    outputs=[priority_pred, department_pred])
model.summary()

Constructing Data and Training the Model

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': 'binary_crossentropy',
                    'department': 'categorical_crossentropy'},
              loss_weights=[1., 0.2])

import numpy as np
# Randomly generated input data
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tag_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')
# Labels
priority_label = np.random.random(size=(1280, 1))
department_label = np.random.randint(2, size=(1280, num_departments))
# Train
history = model.fit(
    {'title': title_data, 'body': body_data, 'tag': tag_data},
    {'priority': priority_label, 'department': department_label},
    batch_size=32,
    epochs=5
)
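
Inference uses the same named-input dictionary; predict returns one array per output, in the order given when the Model was constructed:

priority_scores, department_scores = model.predict(
    {'title': title_data, 'body': body_data, 'tag': tag_data})
print(priority_scores.shape, department_scores.shape)  # (1280, 1) (1280, 4)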

Shared Layers

share_embedding = layers.Embedding(1000, 64)

input1 = keras.Input(shape=(None,), dtype='int32')
input2 = keras.Input(shape=(None,), dtype='int32')

feat1 = share_embedding(input1)
feat2 = share_embedding(input2)
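
Because feat1 and feat2 come from the same layer instance, there is only one (1000, 64) embedding table behind both branches, and gradients from either input update the same weights:

print(len(share_embedding.weights))      # 1
print(share_embedding.weights[0].shape)  # (1000, 64)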

Custom Layers

# import tensorflow as tf
# import tensorflow.keras as keras
class MyDense(layers.Layer):
    def __init__(self, units=32):
        super(MyDense, self).__init__()
        self.units = units
    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
    
    def get_config(self):
        return {'units': self.units}
    
inputs = keras.Input((4,))
outputs = MyDense(10)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(
    config, custom_objects={'MyDense': MyDense}
)
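
A quick check that MyDense behaves like a plain fully connected layer; build() runs on the first call, creating w and b from the observed input shape:

layer = MyDense(units=10)
print(layer(tf.ones((2, 4))).shape)  # (2, 10)
print(layer.get_config())            # {'units': 10}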

# Calling other layers from within a custom layer

# Hyperparameters
time_step = 10
batch_size = 32
hidden_dim = 32
inputs_dim = 5

# The network
class MyRnn(layers.Layer):
    def __init__(self):
        super(MyRnn, self).__init__()
        self.hidden_dim = hidden_dim
        self.projection1 = layers.Dense(units=hidden_dim, activation='relu')
        self.projection2 = layers.Dense(units=hidden_dim, activation='relu')
        self.classifier = layers.Dense(1, activation='sigmoid')
    def call(self, inputs):
        outs = []
        # initial state: zeros of shape (batch_size, hidden_dim)
        states = tf.zeros(shape=[inputs.shape[0], self.hidden_dim])
        for t in range(inputs.shape[1]):  # step through the time dimension
            x = inputs[:, t, :]
            h = self.projection1(x)
            y = h + self.projection2(states)
            states = y
            outs.append(y)
        # stack the per-step outputs into (batch_size, time_steps, hidden_dim)
        features = tf.stack(outs, axis=1)
        print(features.shape)
        return self.classifier(features)

# Build the network
inputs = keras.Input(batch_shape=(batch_size, time_step, inputs_dim))
x = layers.Conv1D(32, 3)(inputs)
print(x.shape)
outputs = MyRnn()(x)
model = keras.Model(inputs, outputs)


# The custom layer can also be called directly on concrete (eager) data
rnn_model = MyRnn()
_ = rnn_model(tf.zeros((1, 10, 5)))

References for this section: an expert blog and the official documentation.


Reposted from blog.csdn.net/qq_40176087/article/details/100906612