Machine learning hands-on code: multivariable linear regression (Linear Regression)

1. Experimental Purpose

hiring.csv contains a company's recruitment data: each candidate's work experience, written test score, and personal interview score. The human resources department sets salaries based on these three factors. Using this data, build a machine learning model that helps the human resources department decide the salary of future candidates, then use the model to predict the salaries of the following candidates:

(1) 2 years work experience, test score 9, interview score 6
(2) 12 years work experience, test score 10, interview score 10
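
The task above amounts to fitting one linear equation with three input variables. As a sketch of what the model estimates (the symbols are notation introduced here, not taken from the original post), the predicted salary is an intercept plus one weighted term per factor:

```latex
\hat{y}_{\text{salary}} = \beta_0
  + \beta_1 \, x_{\text{experience}}
  + \beta_2 \, x_{\text{test}}
  + \beta_3 \, x_{\text{interview}}
```

Training finds the intercept $\beta_0$ and the coefficients $\beta_1, \beta_2, \beta_3$ that minimize the squared error on the rows of hiring.csv.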

2. Import the necessary modules and read the data

import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from word2number import w2n

df = pd.read_csv('hiring.csv')
df
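
Before cleaning, it can help to count the missing values in each column. The sketch below uses a small made-up CSV held in memory as a stand-in for hiring.csv: it reuses the column names from this post, but the rows are hypothetical and are not the real data.

```python
import io
import pandas as pd

# Hypothetical stand-in for hiring.csv: same column names, made-up rows.
csv_text = """experience,test_score(out of 10),interview_score(out of 10),salary($)
,8,9,50000
five,,7,60000
two,6,10,55000
"""

toy_df = pd.read_csv(io.StringIO(csv_text))

# Empty fields are read as NaN; isna().sum() counts them per column.
missing_per_column = toy_df.isna().sum()
print(missing_per_column)

# The experience column holds words, so pandas reads it as a non-numeric column.
print(toy_df.dtypes)
```

Columns with a non-zero count are the ones that need filling before the model can be trained.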

3. Process the data

3.1. Convert the experience field to numbers

df.experience = df.experience.fillna('zero')      # replace every NaN with the word 'zero'
df

df.experience = df.experience.apply(w2n.word_to_num)    # use w2n.word_to_num to convert number words to integers
df
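
To make the two steps concrete without depending on the word2number package, here is a minimal standard-library sketch of the same idea. The lookup table `WORD_TO_INT` and the helper `experience_to_int` are hypothetical names introduced for illustration; they cover only a handful of words, whereas `w2n.word_to_num` handles far more.

```python
# Hypothetical lookup table covering zero through twelve only.
WORD_TO_INT = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "ten": 10, "eleven": 11, "twelve": 12,
}


def experience_to_int(value):
    """Map a number word to an int, treating a missing value as 'zero'."""
    word = "zero" if value is None else value
    return WORD_TO_INT[word.strip().lower()]


# Made-up sample values, including one missing entry.
raw_experience = [None, "five", "two", "Eleven"]
converted = [experience_to_int(v) for v in raw_experience]
print(converted)  # [0, 5, 2, 11]
```

Filling the missing entries with the word 'zero' first means every value can go through the same conversion, which is why the post calls fillna before apply.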

3.2. Replace NaN in the test_score(out of 10) field with the mean

import math

mean_test_score = math.floor(df['test_score(out of 10)'].mean())   # take the mean and round it down
mean_test_score

# Output
7

df['test_score(out of 10)'] = df['test_score(out of 10)'].fillna(mean_test_score)    # fill NaN with the rounded-down mean
df
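
The rounded-down mean and the median are not the same statistic and can give different fill values. The sketch below shows this on a made-up column (the scores are hypothetical, not the hiring.csv values) and then fills the gap with the rounded-down mean, as this section does.

```python
import math
import pandas as pd

# Made-up scores with one missing entry; not the real hiring.csv values.
scores = pd.Series([8.0, 6.0, None, 9.0], name='test_score(out of 10)')

# mean() and median() both skip NaN by default.
fill_value = math.floor(scores.mean())   # floor(23 / 3) = 7
median_value = scores.median()           # 8.0, a different candidate fill value

filled = scores.fillna(fill_value)
print(fill_value, median_value)
print(filled.tolist())
```

Which statistic to use is a modelling choice; the mean is more sensitive to unusually high or low scores than the median.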

4. Training + prediction

reg = LinearRegression()    # instantiate the model
reg.fit(df[['experience','test_score(out of 10)','interview_score(out of 10)']],df['salary($)'])   # train

reg.coef_     # coefficients
reg.intercept_   # intercept

reg.predict([[2,9,6]])    # prediction 1
reg.predict([[12,10,10]])  # prediction 2
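
To see how coef_ and intercept_ produce a prediction, the sketch below fits a model on made-up rows and reproduces predict by hand. The rows and the salary rule are hypothetical, chosen so the fitted numbers are easy to check; they are not the hiring.csv data and say nothing about the real results. The candidates are passed as a DataFrame with the training column names, which avoids the feature-name warning that recent scikit-learn versions emit when a model fitted on a DataFrame is given a plain list.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

feature_cols = ['experience', 'test_score(out of 10)', 'interview_score(out of 10)']

# Made-up feature rows, used only to illustrate the mechanics.
toy = pd.DataFrame(
    [[1, 5, 6], [2, 7, 8], [3, 6, 7], [5, 9, 9], [7, 8, 10]],
    columns=feature_cols,
)
# Hypothetical salaries generated from a known linear rule.
toy['salary($)'] = (
    1000
    + 2000 * toy['experience']
    + 500 * toy['test_score(out of 10)']
    + 300 * toy['interview_score(out of 10)']
)

model = LinearRegression().fit(toy[feature_cols], toy['salary($)'])

# The two candidates from section 1, with the same column names as the training data.
candidates = pd.DataFrame([[2, 9, 6], [12, 10, 10]], columns=feature_cols)
predicted = model.predict(candidates)

# Same result by hand: intercept + dot(features, coefficients).
manual = model.intercept_ + candidates.to_numpy() @ model.coef_
print(np.allclose(predicted, manual))
```

Because the toy salaries follow an exact linear rule, the fitted coefficients recover that rule; on real data the fit is only the least-squares approximation.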

Origin blog.csdn.net/weixin_37763870/article/details/105416553