The main changes are in the model construction and training parts.
Here we use sklearn's RandomizedSearchCV to perform a randomized hyperparameter search. Since RandomizedSearchCV is a sklearn function, the tf.keras model must first be converted into a sklearn-style model; we then define the parameter space and let RandomizedSearchCV search over it.
Randomized hyperparameter search with RandomizedSearchCV
1. Convert to a sklearn model
To convert a tf.keras model into a format sklearn supports, first define the tf.keras model, then wrap it with one of the wrapper functions tf.keras.wrappers.scikit_learn.KerasClassifier or tf.keras.wrappers.scikit_learn.KerasRegressor. Both take a build_fn argument: a function that returns a compiled tf.keras model.
Approach:
define a build_fn -> pass it to KerasRegressor -> get a sklearn-style model
# 1. Convert to a sklearn model
def build_model(hidden_layers = 1,
                layer_size = 30,
                learning_rate = 3e-3):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(layer_size, activation='relu',
                                 input_shape=x_train.shape[1:]))
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size,
                                     activation = 'relu'))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)
    model.compile(loss = 'mse', optimizer = optimizer)
    return model

sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(
    build_fn = build_model)
callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-2)]
history = sklearn_model.fit(x_train_scaled, y_train,
                            epochs = 10,
                            validation_data = (x_valid_scaled, y_valid),
                            callbacks = callbacks)
2. Define the parameter space
The parameters to search are hidden_layers, layer_size, and learning_rate.
The parameter space is the set of values each of these three parameters can take.
Relevant code:
from scipy.stats import reciprocal
# f(x) = 1/(x*log(b/a)), a <= x <= b
param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": np.arange(1, 100),
    "learning_rate": reciprocal(1e-4, 1e-2),
    # Continuous values: just set the min and max;
    # learning_rate is drawn from a distribution.
}
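The reciprocal (log-uniform) distribution spreads samples evenly across orders of magnitude, which is why it suits learning rates better than a uniform distribution. A quick, self-contained sketch of how it behaves:

```python
import numpy as np
from scipy.stats import reciprocal

# Log-uniform distribution on [1e-4, 1e-2]: f(x) = 1/(x*log(b/a))
dist = reciprocal(1e-4, 1e-2)
samples = dist.rvs(size=1000, random_state=42)

# Every sample stays inside [a, b]
print(samples.min() >= 1e-4 and samples.max() <= 1e-2)  # True

# Roughly half the samples fall in each decade,
# [1e-4, 1e-3) and [1e-3, 1e-2] -- uniform per order of magnitude
print(np.mean(samples < 1e-3))
```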
3. Search the parameters
Relevant code:
from sklearn.model_selection import RandomizedSearchCV
random_search_cv = RandomizedSearchCV(sklearn_model,
                                      param_distribution,
                                      n_iter = 10,  # how many parameter sets to sample from param_distribution
                                      cv = 3,
                                      n_jobs = 1)   # how many jobs run in parallel
random_search_cv.fit(x_train_scaled, y_train, epochs = 100,
                     validation_data = (x_valid_scaled, y_valid),
                     callbacks = callbacks)
During hyperparameter search there is a cross-validation mechanism: the training set is split into n folds; n-1 folds are used for training and the last one for validation.
After the search finishes, the model is trained once more on the full training set with the best parameters.
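The fold mechanics can be illustrated without TensorFlow. When cv is an integer and the estimator is a regressor, RandomizedSearchCV uses KFold internally; a sketch with 12 toy samples and cv = 3:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12).reshape(-1, 1)  # 12 toy samples

# cv = 3: the training set is split into 3 folds; each round
# trains on 2 folds and validates on the remaining one
kf = KFold(n_splits=3)
for fold, (train_idx, valid_idx) in enumerate(kf.split(X)):
    print(fold, len(train_idx), len(valid_idx))  # each fold: 8 train, 4 validation
```

The retraining step mentioned above corresponds to RandomizedSearchCV's refit=True default: after the search, the best parameter set is refit on the entire training set.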
Full code:
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf
from tensorflow import keras
print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
print(housing.DESCR)
print(housing.data.shape)
print(housing.target.shape)
from sklearn.model_selection import train_test_split
x_train_all, x_test, y_train_all, y_test = train_test_split(
    housing.data, housing.target, random_state = 7)
x_train, x_valid, y_train, y_valid = train_test_split(
    x_train_all, y_train_all, random_state = 11)
print(x_train.shape, y_train.shape)
print(x_valid.shape, y_valid.shape)
print(x_test.shape, y_test.shape)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_valid_scaled = scaler.transform(x_valid)
x_test_scaled = scaler.transform(x_test)
# RandomizedSearchCV
# 1. Convert to a sklearn model
# 2. Define the parameter space
# 3. Search the parameters
def build_model(hidden_layers = 1,
                layer_size = 30,
                learning_rate = 3e-3):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(layer_size, activation='relu',
                                 input_shape=x_train.shape[1:]))
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size,
                                     activation = 'relu'))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)
    model.compile(loss = 'mse', optimizer = optimizer)
    return model

sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(
    build_fn = build_model)
callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-2)]
history = sklearn_model.fit(x_train_scaled, y_train,
                            epochs = 10,
                            validation_data = (x_valid_scaled, y_valid),
                            callbacks = callbacks)
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()

plot_learning_curves(history)
from scipy.stats import reciprocal
# f(x) = 1/(x*log(b/a)), a <= x <= b
param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": [5, 10, 20, 30],
    "learning_rate": [1e-4, 5e-5, 1e-3, 5e-3, 1e-2],
    # "layer_size": np.arange(1, 100),
    # "learning_rate": reciprocal(1e-4, 1e-2),
    # Continuous values: just set the min and max;
    # learning_rate is drawn from a distribution.
}
from sklearn.model_selection import RandomizedSearchCV
random_search_cv = RandomizedSearchCV(sklearn_model,
                                      param_distribution,
                                      n_iter = 10,  # how many parameter sets to sample from param_distribution
                                      cv = 3,
                                      n_jobs = 1)   # how many jobs run in parallel
random_search_cv.fit(x_train_scaled, y_train, epochs = 100,
                     validation_data = (x_valid_scaled, y_valid),
                     callbacks = callbacks)
# During hyperparameter search there is a cross-validation mechanism: the training
# set is split into n folds; n-1 are used for training and the last for validation.
# After the search, the model is retrained on the full training set with the best parameters.
print(random_search_cv.best_params_)     # best parameters
print(random_search_cv.best_score_)      # best score
print(random_search_cv.best_estimator_)  # best model
model = random_search_cv.best_estimator_.model  # get the underlying model and evaluate it on the test set
model.evaluate(x_test_scaled, y_test)
Note: running the code above may raise: RuntimeError: Cannot clone object <tensorflow.python.keras.wrappers.scikit_learn.KerasRegressor object at 0x000002416E1B6288>, as the constructor either does not set or modifies parameter layer_size
Environment in which the code in this post was tested:
2.0.0
sys.version_info(major=3, minor=7, micro=6, releaselevel='final', serial=0)
matplotlib 3.1.1
numpy 1.17.4
pandas 1.0.0
sklearn 0.22.1
tensorflow 2.0.0
tensorflow_core.keras 2.2.4-tf
Workarounds:
1. Roll back to sklearn 0.21.3, which works.
2. Use plain Python lists in param_distribution. (This is likely an issue with KerasRegressor in newer TensorFlow versions: its parameters run into trouble during deep copy, so copying complex numpy objects fails. Switching the search space to plain lists avoids the error.)
param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": [5, 10, 20, 30],
    "learning_rate": [1e-4, 5e-5, 1e-3, 5e-3, 1e-2],
}
Since the deep copy happens out of the user's sight, the conclusion is: either roll back to sklearn 0.21.3, or use plain lists in param_distribution.
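The plain-list workaround can be sanity-checked without TensorFlow: sklearn's ParameterSampler is the component RandomizedSearchCV uses internally to draw parameter sets, so sampling from the list-only space directly shows that it behaves as expected:

```python
from sklearn.model_selection import ParameterSampler

# Plain Python lists only -- no numpy arrays or scipy distributions,
# so nothing exotic has to survive a deep copy
param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": [5, 10, 20, 30],
    "learning_rate": [1e-4, 5e-5, 1e-3, 5e-3, 1e-2],
}

samples = list(ParameterSampler(param_distribution, n_iter=10, random_state=7))
print(len(samples))  # 10 sampled parameter sets
for s in samples:
    # Every sampled value comes from the corresponding list
    assert s["hidden_layers"] in param_distribution["hidden_layers"]
    assert s["layer_size"] in param_distribution["layer_size"]
    assert s["learning_rate"] in param_distribution["learning_rate"]
```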