Hands-On sklearn Hyperparameter Search (Randomized)

The main changes are to the model-construction and training code.

Here we use sklearn's RandomizedSearchCV to perform randomized hyperparameter search. Since RandomizedSearchCV is a sklearn utility, the tf.keras model first has to be converted into a sklearn-style model; then we define the parameter space, and finally let RandomizedSearchCV search over it.

Randomized hyperparameter search with RandomizedSearchCV

1. Convert to a sklearn model

To convert a tf.keras model into a form sklearn supports, first define the tf.keras model, then wrap it with one of the conversion helpers: tf.keras.wrappers.scikit_learn.KerasClassifier or tf.keras.wrappers.scikit_learn.KerasRegressor. Both take a build_fn argument, where build_fn is a function that returns a fully built tf.keras model.

The idea:

define a build_fn -> wrap it with KerasRegressor -> get a sklearn-style model

# 1. Convert to a sklearn model
def build_model(hidden_layers = 1,
                layer_size = 30,
                learning_rate = 3e-3):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(layer_size, activation='relu',
                                 input_shape=x_train.shape[1:]))
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size,
                                     activation = 'relu'))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)
    model.compile(loss = 'mse', optimizer = optimizer)
    return model

sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(
    build_fn = build_model)
callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-2)]
history = sklearn_model.fit(x_train_scaled, y_train,
                            epochs = 10,
                            validation_data = (x_valid_scaled, y_valid),
                            callbacks = callbacks)
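
Once wrapped, the object behaves like a regular sklearn estimator. As a quick sanity check, here is a minimal sketch (predict on the wrapper returns a plain numpy array; get_params shows build_fn plus any constructor kwargs):

# The wrapper exposes the standard sklearn estimator interface
print(sklearn_model.get_params())   # build_fn plus any kwargs passed at construction
predictions = sklearn_model.predict(x_valid_scaled)   # plain numpy array
print(predictions[:5])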

2. Define the parameter space

The parameters to search over are hidden_layers, layer_size, and learning_rate.

The parameter space consists of the values these three parameters can take.

The relevant code:

from scipy.stats import reciprocal
# reciprocal PDF: f(x) = 1/(x * log(b/a)), for a <= x <= b

param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": np.arange(1, 100),
    "learning_rate": reciprocal(1e-4, 1e-2),
    # continuous values: just set the min and max;
    # learning_rate is sampled from the distribution
}
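
To get a feel for what reciprocal(1e-4, 1e-2) produces, you can draw a few samples directly from the frozen scipy distribution; a minimal sketch:

# Samples are spread roughly evenly across orders of magnitude between
# 1e-4 and 1e-2, which is why this distribution suits learning rates.
samples = reciprocal(1e-4, 1e-2).rvs(size=5, random_state=42)
print(samples)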

3. Search the parameters

The relevant code:

from sklearn.model_selection import RandomizedSearchCV

random_search_cv = RandomizedSearchCV(sklearn_model,
                                      param_distribution,
                                      n_iter = 10,  # how many parameter sets to sample from param_distribution
                                      cv = 3,       # number of cross-validation folds
                                      n_jobs = 1)   # how many jobs to run in parallel
random_search_cv.fit(x_train_scaled, y_train, epochs = 100,
                     validation_data = (x_valid_scaled, y_valid),
                     callbacks = callbacks)

During the hyperparameter search, cross-validation is used: the training set is split into n folds; n-1 folds are used for training and the remaining fold for validation.
After the search finishes, the estimator is retrained on the entire training set with the best parameters found.
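
For intuition, what RandomizedSearchCV does internally with cv = 3 is roughly equivalent to the following sketch (a simplified illustration, not the actual implementation):

from sklearn.model_selection import KFold

# Each sampled parameter set is trained on 2 of the 3 folds and scored on
# the held-out fold; the 3 scores are averaged to rank the candidates.
kfold = KFold(n_splits=3)
for train_idx, valid_idx in kfold.split(x_train_scaled):
    print(len(train_idx), len(valid_idx))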

Full code below:

import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf

from tensorflow import keras

print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
print(housing.DESCR)
print(housing.data.shape)
print(housing.target.shape)

from sklearn.model_selection import train_test_split

x_train_all, x_test, y_train_all, y_test = train_test_split(
    housing.data, housing.target, random_state = 7)
x_train, x_valid, y_train, y_valid = train_test_split(
    x_train_all, y_train_all, random_state = 11)
print(x_train.shape, y_train.shape)
print(x_valid.shape, y_valid.shape)
print(x_test.shape, y_test.shape)

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_valid_scaled = scaler.transform(x_valid)
x_test_scaled = scaler.transform(x_test)

# RandomizedSearchCV
# 1. Convert to a sklearn model
# 2. Define the parameter space
# 3. Search the parameters

def build_model(hidden_layers = 1,
                layer_size = 30,
                learning_rate = 3e-3):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(layer_size, activation='relu',
                                 input_shape=x_train.shape[1:]))
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size,
                                     activation = 'relu'))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)
    model.compile(loss = 'mse', optimizer = optimizer)
    return model

sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(
    build_fn = build_model)
callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-2)]
history = sklearn_model.fit(x_train_scaled, y_train,
                            epochs = 10,
                            validation_data = (x_valid_scaled, y_valid),
                            callbacks = callbacks)

def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()
plot_learning_curves(history)


from scipy.stats import reciprocal
# f(x) = 1/(x*log(b/a)) a <= x <= b

param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": [5, 10, 20, 30],
    "learning_rate": [1e-4, 5e-5, 1e-3, 5e-3, 1e-2],
#     "layer_size": np.arange(1, 100),
#     "learning_rate": reciprocal(1e-4, 1e-2),
    # continuous values: sampled from the distribution; see the note after
    # this listing for why the plain-list versions are active instead
}

from sklearn.model_selection import RandomizedSearchCV

random_search_cv = RandomizedSearchCV(sklearn_model,
                                      param_distribution,
                                      n_iter = 10,  # how many parameter sets to sample from param_distribution
                                      cv = 3,       # number of cross-validation folds
                                      n_jobs = 1)   # how many jobs to run in parallel
random_search_cv.fit(x_train_scaled, y_train, epochs = 100,
                     validation_data = (x_valid_scaled, y_valid),
                     callbacks = callbacks)

# During the hyperparameter search, cross-validation is used: the training set
# is split into n folds; n-1 folds train and the remaining fold validates.
# After the search, the model is retrained on the full training set with the best parameters.

print(random_search_cv.best_params_)     # best parameters
print(random_search_cv.best_score_)      # best score
print(random_search_cv.best_estimator_)  # best model (the wrapper)

model = random_search_cv.best_estimator_.model  # underlying keras model, evaluated on the test set
model.evaluate(x_test_scaled, y_test)
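
Besides best_params_, the complete search record is stored in cv_results_. A quick way to inspect how every sampled parameter set scored (a small sketch using pandas, which is already imported above):

# One row per sampled parameter set, with its mean cross-validation score
results = pd.DataFrame(random_search_cv.cv_results_)
print(results[["params", "mean_test_score", "rank_test_score"]])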

Note: you may hit this error: RuntimeError: Cannot clone object <tensorflow.python.keras.wrappers.scikit_learn.KerasRegressor object at 0x000002416E1B6288>, as the constructor either does not set or modifies parameter layer_size

Environment used to test the code in this post:
2.0.0
sys.version_info(major=3, minor=7, micro=6, releaselevel='final', serial=0)
matplotlib 3.1.1
numpy 1.17.4
pandas 1.0.0
sklearn 0.22.1
tensorflow 2.0.0
tensorflow_core.keras 2.2.4-tf

Workarounds:

1. Roll back to sklearn 0.21.3, which works.

2. Use plain Python lists in param_distribution (this may be a problem with KerasRegressor in newer TensorFlow versions: deep-copying its parameters fails when copying complex numpy objects, but plain lists copy without error):

param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": [5, 10, 20, 30],
    "learning_rate": [1e-4, 5e-5, 1e-3, 5e-3, 1e-2],
}

Since the deep copy is invisible to the user, the conclusion is: either roll back to sklearn 0.21.3, or use plain lists in param_distribution.
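
If you still want a wide integer range for layer_size, a workaround in the same spirit (untested here, but consistent with workaround 2) is to convert the numpy range into a plain Python list of ints before putting it in param_distribution:

# np.arange yields numpy integers, which trigger the deep-copy problem;
# list(range(...)) yields plain Python ints, which clone cleanly.
param_distribution = {
    "hidden_layers": [1, 2, 3, 4],
    "layer_size": list(range(1, 100)),
    "learning_rate": [1e-4, 5e-5, 1e-3, 5e-3, 1e-2],
}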

Reposted from blog.csdn.net/qq_41660119/article/details/105765589