Code Practice | Autoencoders

Previously


Today we continue with the material in Chapter 4.

Previously we covered:

The Beginning of Deep Learning: Fully Connected Neural Networks

Mastering CNN Convolutional Neural Networks in One Article

Hyperparameters | Part 1

Hyperparameters | Part 2

Training Itself on Itself | Autoencoders

RNNs Made Easy | RNNs and Their Variants | Part 1

LSTM Made Easy | RNN Variants | LSTM Long Short-Term Memory Networks

GRU Made Easy | Gated Recurrent Units (GRU)

Code Practice | Fully Connected Network Regression: House Price Prediction

Code Practice | Fully Connected Networks for Text Classification

Code Practice | CNNs for Text Classification

Code Practice | CNNs for Image Classification

4.7 Code Practice

4.7.5 Autoencoders

In this section, we'll use the MNIST handwritten-digit dataset to implement five autoencoder variants: a single-layer autoencoder, a multi-layer autoencoder, a convolutional autoencoder, a sparse autoencoder, and a denoising autoencoder.

Code adapted from:

https://github.com/nathanhubens/Autoencoders

1. Single-layer autoencoder

# /chapter4/4_7_5_AutoEncoder.ipynb
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Input, Dense
from keras import regularizers
from keras.layers.convolutional import Conv2D, MaxPooling2D, UpSampling2D
from keras.utils import np_utils

1) Load the MNIST data and preprocess the images

(X_train, _), (X_test, _) = mnist.load_data()

# Normalize pixel values to [0, 1]
X_train = X_train.astype("float32")/255.
X_test = X_test.astype("float32")/255.

print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

X_train shape: (60000, 28, 28)

60000 train samples

10000 test samples

# np.prod flattens each 28x28 image into a 784-dim vector,
# matching the 784 input neurons of the fully connected network.
X_train = X_train.reshape((len(X_train), np.prod(X_train.shape[1:])))
X_test = X_test.reshape((len(X_test), np.prod(X_test.shape[1:])))
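As a quick sanity check of the flattening (np.prod simply multiplies the trailing dimensions):

print(np.prod((28, 28)))  # 784
print(X_train.shape)      # now (60000, 784)
print(X_test.shape)       # now (10000, 784)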

2) Build the autoencoder model

input_size = 784
hidden_size = 64
output_size = 784

x = Input(shape=(input_size,))
h = Dense(hidden_size, activation='relu')(x)
r = Dense(output_size, activation='sigmoid')(h)

autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
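Before training, it can be worth confirming the architecture with Keras's built-in summary, which prints the layer stack and parameter count (here 784 -> 64 -> 784, roughly 101k weights):

autoencoder.summary()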

3) Training

epochs = 5
batch_size = 128

history = autoencoder.fit(X_train, X_train,
                          batch_size=batch_size,
                          epochs=epochs, verbose=1,
                          validation_data=(X_test, X_test))

4) Inspect the encoder's compressed codes

conv_encoder = Model(x, h)  # encoder-only model
encoded_imgs = conv_encoder.predict(X_test)

# Show the compressed codes of 10 test digits
n = 10
plt.figure(figsize=(20, 8))
for i in range(n):
    ax = plt.subplot(1, n, i+1)
    plt.imshow(encoded_imgs[i].reshape(4, 16).T)  # 64-dim code displayed as a 4x16 image
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

5) Inspect the reconstructions

decoded_imgs = autoencoder.predict(X_test)
n = 10
plt.figure(figsize=(20, 6))
for i in range(n):
    # Original image
    ax = plt.subplot(3, n, i+1)
    plt.imshow(X_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Reconstruction
    ax = plt.subplot(3, n, i+n+1)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.show()

6) Visualize the training process

print(history.history.keys())

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()

dict_keys(['val_loss', 'loss'])

2. Multi-layer autoencoder

1) Build the multi-layer autoencoder

input_size = 784
hidden_size = 128
code_size = 64

x = Input(shape=(input_size,))
hidden_1 = Dense(hidden_size, activation='relu')(x)
h = Dense(code_size, activation='relu')(hidden_1)
hidden_2 = Dense(hidden_size, activation='relu')(h)
r = Dense(input_size, activation='sigmoid')(hidden_2)

autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
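If you later want to decode codes directly, without running the encoder, you can reuse the decoder layers. A minimal sketch, assuming the layer order of the model defined above (its last two Dense layers form the decoder); since the layers are shared, this decoder stays in sync as training updates the weights:

code_input = Input(shape=(code_size,))
decoder_hidden = autoencoder.layers[-2](code_input)
decoder_output = autoencoder.layers[-1](decoder_hidden)
decoder = Model(inputs=code_input, outputs=decoder_output)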

2) Train the model

epochs = 5
batch_size = 128

history = autoencoder.fit(X_train, X_train,
                          batch_size=batch_size,
                          epochs=epochs,
                          verbose=1,
                          validation_data=(X_test, X_test))

3) Inspect the encodings

conv_encoder = Model(x, h)  # encoder-only model
encoded_imgs = conv_encoder.predict(X_test)

# Show the compressed codes of 10 test digits
n = 10
plt.figure(figsize=(20, 8))
for i in range(n):
    ax = plt.subplot(1, n, i+1)
    plt.imshow(encoded_imgs[i].reshape(4, 16).T)
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

4) Inspect the reconstructions

decoded_imgs = autoencoder.predict(X_test)

n = 10
plt.figure(figsize=(20, 6))
for i in range(n):
    # Original image
    ax = plt.subplot(3, n, i+1)
    plt.imshow(X_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Reconstruction
    ax = plt.subplot(3, n, i+n+1)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.show()

5) Visualize the training process

print(history.history.keys())

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()

dict_keys(['val_loss', 'loss'])

3. Convolutional autoencoder

1) Load the dataset

nb_classes = 10  # 10 digit classes

(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# Normalize pixel values to [0, 1]
X_train = X_train.astype("float32")/255.
X_test = X_test.astype("float32")/255.
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# One-hot encode the labels (not used by the autoencoder itself)
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)

2) Build the convolutional autoencoder

x = Input(shape=(28, 28, 1))

# Encoder: 28x28x1 -> 14x14x16 -> 7x7x8 -> 4x4x8
conv1_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool1)
pool2 = MaxPooling2D((2, 2), padding='same')(conv1_2)
conv1_3 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool2)
h = MaxPooling2D((2, 2), padding='same')(conv1_3)

# Decoder: 4x4x8 -> 8x8x8 -> 16x16x8 -> 14x14x16 -> 28x28x1
conv2_1 = Conv2D(8, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
# 'valid' padding here shrinks 16x16 to 14x14, so the final
# upsampling restores the original 28x28 resolution
conv2_3 = Conv2D(16, (3, 3), activation='relu')(up2)
up3 = UpSampling2D((2, 2))(conv2_3)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up3)

autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

3) Training

epochs = 3
batch_size = 128

history = autoencoder.fit(X_train, X_train,
                          batch_size=batch_size,
                          epochs=epochs, verbose=1,
                          validation_data=(X_test, X_test))
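Sections 1 and 2 visualized the dense codes; the convolutional codes can be pulled out the same way, except that here each 28x28 digit is compressed into a 4x4x8 feature volume:

conv_encoder = Model(x, h)  # encoder-only model
encoded_imgs = conv_encoder.predict(X_test)
print(encoded_imgs.shape)   # (10000, 4, 4, 8)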

4) Inspect the reconstructions

decoded_imgs = autoencoder.predict(X_test)

n = 10
plt.figure(figsize=(20, 6))
for i in range(n):
    # Original image
    ax = plt.subplot(3, n, i+1)
    plt.imshow(X_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Reconstruction
    ax = plt.subplot(3, n, i+n+1)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.show()

5) Visualize the training process

print(history.history.keys())

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()

dict_keys(['val_loss', 'loss'])

4. Sparse (regularized) autoencoder

1) Load the dataset

(X_train, _), (X_test, _) = mnist.load_data()

# Normalize pixel values to [0, 1]
X_train = X_train.astype("float32")/255.
X_test = X_test.astype("float32")/255.

print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

X_train shape: (60000, 28, 28)

60000 train samples

10000 test samples

# np.prod flattens each 28x28 image into a 784-dim vector,
# matching the 784 input neurons of the fully connected network.
X_train = X_train.reshape((len(X_train), np.prod(X_train.shape[1:])))
X_test = X_test.reshape((len(X_test), np.prod(X_test.shape[1:])))

2) Build the sparse autoencoder

input_size = 784
hidden_size = 32
output_size = 784

x = Input(shape=(input_size,))
# The L1 activity regularizer (10e-5, i.e. 1e-4) penalizes large hidden
# activations, pushing most code units towards zero (sparsity)
h = Dense(hidden_size, activation='relu',
          activity_regularizer=regularizers.l1(10e-5))(x)
r = Dense(output_size, activation='sigmoid')(h)

autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')

3) Training

epochs = 15
batch_size = 128

history = autoencoder.fit(X_train, X_train,
                          batch_size=batch_size,
                          epochs=epochs,
                          verbose=1,
                          validation_data=(X_test, X_test))
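To check that the L1 activity penalty is actually inducing sparsity, we can look at the hidden activations on the test set. A rough check (a sketch; the exact numbers vary from run to run), to compare informally against the unregularized single-layer model:

sparse_encoder = Model(x, h)
codes = sparse_encoder.predict(X_test)
print('mean hidden activation:', codes.mean())
print('fraction of near-zero units:', float((codes < 1e-3).mean()))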

4) Inspect the reconstructions

decoded_imgs = autoencoder.predict(X_test)  # recompute with the sparse model

n = 10
plt.figure(figsize=(20, 6))
for i in range(n):
    # Original image
    ax = plt.subplot(3, n, i+1)
    plt.imshow(X_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Reconstruction
    ax = plt.subplot(3, n, i+n+1)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.show()

5) Visualize the training process

print(history.history.keys())

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()

dict_keys(['val_loss', 'loss'])

5. Denoising autoencoder

1) Load the dataset

(X_train, _), (X_test, _) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# Normalize pixel values to [0, 1]
X_train = X_train.astype("float32")/255.
X_test = X_test.astype("float32")/255.

2) Add noise

noise_factor = 0.5
X_train_noisy = X_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=X_train.shape)
X_test_noisy = X_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=X_test.shape)

# Clip back into the valid [0, 1] pixel range
X_train_noisy = np.clip(X_train_noisy, 0., 1.)
X_test_noisy = np.clip(X_test_noisy, 0., 1.)
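Before training, it's worth eyeballing how corrupted the inputs are, using the same plotting pattern as the earlier sections:

n = 10
plt.figure(figsize=(20, 2))
for i in range(n):
    ax = plt.subplot(1, n, i+1)
    plt.imshow(X_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()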

3) Build the denoising autoencoder

x = Input(shape=(28, 28, 1))

# Encoder: 28x28x1 -> 14x14x32 -> 7x7x32
conv1_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(pool1)
h = MaxPooling2D((2, 2), padding='same')(conv1_2)

# Decoder: 7x7x32 -> 14x14x32 -> 28x28x32 -> 28x28x1
conv2_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up2)

autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

4) Training

epochs = 3
batch_size = 128

# Noisy images as input, clean images as the target
history = autoencoder.fit(X_train_noisy, X_train,
                          batch_size=batch_size,
                          epochs=epochs, verbose=1,
                          validation_data=(X_test_noisy, X_test))

5) Inspect the denoising results

decoded_imgs = autoencoder.predict(X_test_noisy)

n = 10
plt.figure(figsize=(20, 6))
for i in range(n):
    # Noisy input
    ax = plt.subplot(3, n, i+1)
    plt.imshow(X_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Denoised reconstruction
    ax = plt.subplot(3, n, i+n+1)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.show()

6) Visualize the training process

print(history.history.keys())

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()

dict_keys(['val_loss', 'loss'])

In the next installment, we'll cover

Code Practice | An LSTM Poetry-Writing Bot

Stay tuned~


Follow my WeChat official account for occasional updates on related topics~

Content | 阿力阿哩哩

Editing | 阿璃



Reposted from blog.csdn.net/Chile_Wang/article/details/104368247