2019118_Four Chemical Data Analyses (3)

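import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rcParams['font.sans-serif'] = ['FangSong']  # set the default font (needed for the Chinese column name)
mpl.rcParams['axes.unicode_minus'] = False  # keep the minus sign '-' from rendering as a box in saved figures
%matplotlib inline
# Load the spreadsheet; the first column is the temperature, the remaining columns are numeric features
test = pd.read_excel('数据.xlsx')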
test.head()
   温度/℃    0.001    0.005     0.01     0.02     0.04     0.06     0.08      0.1      0.5
0      0   1.0002   1.0002   1.0002   1.0002   1.0002   1.0002   1.0002   1.0002   1.0000
1     20 135.2300   1.0017   1.0017   1.0017   1.0017   1.0017   1.0017   1.0017   1.0015
2     40 144.4700  28.8600   1.0078   1.0078   1.0078   1.0078   1.0078   1.0078   1.0076
3     60 153.7100  30.7100  15.3400   1.0710   1.0710   1.0710   1.0710   1.0710   1.0169
4     80 162.9500  32.5700  16.2700   8.1190   4.0440   1.0292   1.0292   1.0292   1.0290
test.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35 entries, 0 to 34
Data columns (total 10 columns):
温度/℃     35 non-null int64
0.001    35 non-null float64
0.005    35 non-null float64
0.01     35 non-null float64
0.02     35 non-null float64
0.04     35 non-null float64
0.06     35 non-null float64
0.08     35 non-null float64
0.1      35 non-null float64
0.5      35 non-null float64
dtypes: float64(9), int64(1)
memory usage: 2.8 KB
test.hist(figsize=(20,10))

[Figure: histogram of each column in test (output_4_1.png)]

Predicting the temperature (温度/℃)

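from sklearn.model_selection import train_test_split

# Target: the temperature column; features: the remaining nine columns
y = test['温度/℃']
X = test.drop(['温度/℃'], axis=1)
# Note: the positive/negative counts are left over from a classification template.
# For this regression target they only count rows where the temperature equals 1 or 0.
print('data shape: {0}; no. positive: {1}; no. negative: {2}'.format(
    X.shape, y[y==1].shape[0], y[y==0].shape[0]))
# Hold out 10% of the rows for testing (no random_state is set, so the split changes on every run)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)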
data shape: (35, 9); no. positive: 0; no. negative: 1
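
With only 35 rows, test_size=0.1 leaves just a handful of rows in the test set, so a single test score can vary noticeably from one random split to the next. One common alternative, which the original post does not use, is k-fold cross-validation. The following is a minimal pure-Python sketch of how fold indices could be generated; the helper name kfold_indices is hypothetical, and in practice sklearn.model_selection.KFold provides the same functionality.

```python
import random

def kfold_indices(n_samples, n_splits=5, seed=0):
    """Return a list of (train_idx, test_idx) pairs; each row is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle with a fixed seed for reproducibility
    # Spread any remainder so that fold sizes differ by at most one
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

# 35 rows (as in this dataset) split into 5 folds of 7 test rows each
folds = kfold_indices(35, n_splits=5)
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

Each pair of index lists could then be used with X.iloc[...] and y.iloc[...] to fit and score the model once per fold, and the per-fold scores averaged.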

Prediction with XGBoost

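from xgboost import XGBRegressor  # the original imported sklearn.linear_model, which is unused here

# Gradient-boosted regression trees with shallow trees (depth 2)
model = XGBRegressor(max_depth=2)
model.fit(X_train, y_train)
# score() returns the coefficient of determination (R^2)
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
print('train score: {train_score:.6f}; test score: {test_score:.6f}'.format(
    train_score=train_score, test_score=test_score))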
D:\anaconda\lib\site-packages\xgboost\core.py:587: FutureWarning: Series.base is deprecated and will be removed in a future version
  if getattr(data, 'base', None) is not None and \


[20:26:54] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
train score: 0.999972; test score: 0.990900
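
The scores printed above come from the regressor's score method, which for scikit-learn-style regressors is the coefficient of determination R^2 = 1 - SS_res / SS_tot. As a rough illustration of what that number measures, here is a hand-computed sketch; the function name r2_score_manual and the sample values are made up for the example and are not taken from the dataset.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical temperatures and predictions, for illustration only
y_true = [0, 20, 40, 60]
y_pred = [1, 19, 42, 58]

r2 = r2_score_manual(y_true, y_pred)
print('R^2: {0:.6f}'.format(r2))  # SS_res = 10, SS_tot = 2000, so R^2 = 0.995000
```

A value of 1.0 means the predictions match the targets exactly, and a value near 0 means the model does no better than always predicting the mean.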
Reposted from blog.csdn.net/qq_39309652/article/details/103307304