Model tuning tool: hyperopt

I. Introduction

In machine learning, model training is time-consuming. Every algorithm has a number of hyperparameters that must be configured before training, and these hyperparameters have a considerable impact on the training results.
Hyperparameter optimization is therefore important, but also time-consuming and labor-intensive.
Hyperopt provides an optimization interface that accepts an evaluation function and a parameter space and computes the value of the loss function at points in that space, which greatly simplifies tuning.
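
As a minimal sketch of that interface (a toy objective, not part of the iris example below), hyperopt only needs a function to minimize, a search space, and a search algorithm:

from hyperopt import fmin, tpe, hp

# Toy objective: hyperopt searches for the x that minimizes the returned value
def objective(x):
    return (x - 3) ** 2

# Search x in [-10, 10] with the TPE algorithm for 100 evaluations
best = fmin(fn=objective,
            space=hp.uniform("x", -10, 10),
            algo=tpe.suggest,
            max_evals=100)
print(best)  # a dict such as {'x': 2.99...}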

II. Hands-on practice

The following example, using a perceptron to classify the iris dataset, shows how the hyperopt library is used.

1. Data reading and standardization

from sklearn import datasets
import numpy as np
# train_test_split lives in sklearn.model_selection (sklearn.cross_validation in very old versions)
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the iris dataset and split it into training and test sets
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Standardize the features: fit the scaler on the training set, then transform both sets
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

The code above splits the dataset into training and test sets and standardizes the features.

2. Normal Perceptron

from sklearn.linear_model import Perceptron

# n_iter=40 training epochs, learning rate eta0=0.1
# (in newer scikit-learn versions this parameter is called max_iter)
ppn = Perceptron(n_iter=40, eta0=0.1, random_state=0)
ppn.fit(X_train_std, y_train)

y_pred = ppn.predict(X_test_std)
print(accuracy_score(y_test, y_pred))

Here the perceptron is trained with n_iter set to 40 and eta0 set to 0.1, and the resulting test accuracy is 0.822222222222.
This accuracy is not satisfactory. Normally we would have to change the parameters by hand and retrain the model, roughly as in the sketch below.
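
For illustration only (this loop is not part of the original example; the values tried are arbitrary), manual tuning usually amounts to retraining over a handful of hand-picked combinations and keeping the best one:

# Hypothetical manual search over a few hand-picked parameter combinations
best_acc, best_params = 0.0, None
for n_iter in (10, 20, 40, 80):
    for eta0 in (0.01, 0.1, 0.5):
        ppn = Perceptron(n_iter=n_iter, eta0=eta0, random_state=0)
        ppn.fit(X_train_std, y_train)
        acc = accuracy_score(y_test, ppn.predict(X_test_std))
        if acc > best_acc:
            best_acc, best_params = acc, (n_iter, eta0)
print(best_acc, best_params)

This quickly becomes tedious as the number of hyperparameters grows, which is exactly what hyperopt automates.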

3. Perceptron using hyperopt

First define the evaluation function, which wraps the perceptron model from section 2. Since fmin searches for the minimum, the function returns the negative accuracy.

def percept(args):
    # Use the standardized data prepared above
    global X_train_std, y_train, X_test_std, y_test
    ppn = Perceptron(n_iter=int(args["n_iter"]), eta0=args["eta"])
    ppn.fit(X_train_std, y_train)
    y_pred = ppn.predict(X_test_std)
    # fmin minimizes, so return the negative accuracy
    return -accuracy_score(y_test, y_pred)

Next, the parameter space is defined. The expressions recognized by hyperopt's hp module are listed below (a small sampling sketch follows the list):

  • hp.choice(label, options)

    Returns one of the options, which should be a list or tuple. The elements of options may themselves be [nested] stochastic expressions; in that case, stochastic choices that appear only in some options become conditional parameters.

  • hp.randint(label, upper)

    Returns a random integer in the range [0, upper). The semantics of this distribution is that the loss function is no more correlated between nearby integer values than between more distant ones; for example, it is an appropriate distribution for describing a random seed. If the loss function is probably more correlated for nearby integer values, you should probably use one of the "quantized" continuous distributions instead, such as quniform, qloguniform, qnormal or qlognormal.

  • hp.uniform(label, low, high)

    Returns a value uniformly distributed between low and high.
    During optimization, this variable is constrained to a two-sided interval.

  • hp.quniform(label, low, high, q)

    Returns a value like round(uniform(low, high) / q) * q.
    Suitable for a discrete value where the objective is still somewhat "smooth", but which should be bounded both above and below (a two-sided interval).

  • hp.loguniform(label, low, high)

    Returns a value drawn according to exp(uniform(low, high)), so that the logarithm of the returned value is uniformly distributed.
    During optimization, the variable is constrained to the interval [exp(low), exp(high)].

  • hp.qloguniform(label, low, high, q)

    Returns a value like round(exp(uniform(low, high)) / q) * q.
    Suitable for a discrete variable where the objective is "smooth" and gets smoother with the size of the value, but which should be bounded both above and below (a two-sided interval).

  • hp.normal(label, mu, sigma)

    Returns a normally distributed real value with mean mu and standard deviation sigma. During optimization, this is an unconstrained variable.

  • hp.qnormal(label, mu, sigma, q)

    Returns a value like round(normal(mu, sigma) / q) * q.
    Suitable for a discrete value that is probably centered around mu but is fundamentally unbounded.

  • hp.lognormal(label, mu, sigma) (lognormal distribution)

    Returns a value drawn according to exp(normal(mu, sigma)), so that the logarithm of the returned value is normally distributed. During optimization, this variable is constrained to positive values.

  • hp.qlognormal(label, mu, sigma, q)

    Returns a value like round(exp(normal(mu, sigma)) / q) * q.
    Suitable for a discrete variable where the objective is "smooth" and gets smoother with the size of the variable, and which is bounded on one side (a one-sided interval).
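
The sketch below (using hyperopt's hyperopt.pyll.stochastic helper; the labels and ranges are arbitrary, chosen only for illustration) draws one random point from a space built with several of these expressions, which is a convenient way to check that a space behaves as intended:

from hyperopt import hp
from hyperopt.pyll import stochastic

demo_space = {
    "optimizer": hp.choice("optimizer", ["sgd", "adam"]),
    "seed": hp.randint("seed", 10),
    "lr": hp.loguniform("lr", -5, 0),           # roughly [exp(-5), exp(0)]
    "batch": hp.quniform("batch", 16, 128, 16)  # multiples of 16
}
print(stochastic.sample(demo_space))  # one random sample drawn from the space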

For the perceptron example, the parameter space over n_iter and eta is defined as:

from hyperopt import fmin, tpe, hp

space = {"n_iter": hp.randint("n_iter", 50),
         "eta": hp.uniform("eta", 0.05, 0.5)}

Finally, run the parameter search:

best = fmin(percept, space, algo=tpe.suggest, max_evals=100)
print(best)
print(percept(best))

algo is the user-selected search algorithm. The algorithms hyperopt currently supports include random search (hyperopt.rand.suggest), simulated annealing (hyperopt.anneal.suggest), and the TPE algorithm (hyperopt.tpe.suggest).
max_evals is the number of times the model is evaluated.
Running the above code, the perceptron parameters obtained are {'n_iter': 29, 'eta': 0.10959845686587069}, and percept(best) returns -0.977777777778, i.e. an accuracy of about 0.978.
Compared with the parameters chosen without hyperopt, the accuracy improves considerably, and the difficulty of tuning is greatly reduced.
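
As a small additional sketch (not part of the original walkthrough), the search algorithm can be swapped by passing a different suggest function, and the evaluation history can be recorded with a Trials object:

from hyperopt import fmin, rand, Trials

# Random search instead of TPE; anneal.suggest would select simulated annealing
trials = Trials()
best_rand = fmin(percept, space, algo=rand.suggest,
                 max_evals=100, trials=trials)
print(best_rand)
print(min(trials.losses()))  # best (most negative) loss found during the search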
