ex2: Logistic Regression

Introduction:

  In this exercise, you will implement logistic regression and apply it to two different datasets. Before starting the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the related topics. To get started, download the starter code and extract its contents into the directory where you want to complete the exercise. If necessary, use the cd command in Octave/MATLAB to change to this directory before starting. You can also find instructions for installing Octave/MATLAB in the "Environment Setup Instructions" on the course website.

Files contained in this exercise

ex2.m - Octave/MATLAB script that guides you through the exercise
ex2_reg.m - Octave/MATLAB script for the later part of the exercise
ex2data1.txt - Training set for the first half of the exercise
ex2data2.txt - Training set for the second half of the exercise
submit.m - Submission script that sends your solutions to our server
mapFeature.m - Function to generate polynomial features
plotDecisionBoundary.m - Function to plot the classifier's decision boundary


[*] plotData.m - Function to plot 2D classification data
[*] sigmoid.m - Sigmoid function (the S-shaped logistic function)
[*] costFunction.m - Logistic regression cost function
[*] predict.m - Logistic regression prediction function
[*] costFunctionReg.m - Regularized logistic regression cost function
* indicates files you will need to complete

  Throughout the exercise, you will use the scripts ex2.m and ex2_reg.m. These scripts set up the datasets for the problems and call the functions that you will write. You do not need to modify either of them. You only need to modify the functions in the other files, following the instructions in this assignment.

1. Logistic regression

In this part of the exercise, you will build a logistic regression model to predict whether a student is admitted to a university. Suppose you are a university dean and you want to determine each applicant's chance of admission based on their results in two exams. You have historical data from previous applicants that you can use as a training set for logistic regression. For each training example, you have the applicant's scores on the two exams and the admission decision. Your task is to build a classification model that estimates an applicant's probability of admission from the scores on these two exams. The outline and framework code in ex2.m will guide you through this exercise.
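To make the pieces of this part concrete, here is a minimal sketch of the sigmoid function, the cost with its gradient, and the prediction step. It is written in plain Python for illustration only (the exercise files themselves are Octave/MATLAB), the function names are hypothetical, and the four training rows are made-up values, not the contents of ex2data1.txt.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), split by sign to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def hypothesis(theta, x):
    """Estimated probability that y = 1; x must already include the leading 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def cost_and_gradient(theta, X, y):
    """Average cross-entropy cost and its gradient over the training set."""
    m = len(y)
    cost = 0.0
    grad = [0.0] * len(theta)
    for x, label in zip(X, y):
        h = hypothesis(theta, x)
        # Clip so that log() never receives exactly 0 (a safeguard added for this sketch).
        hc = min(max(h, 1e-12), 1.0 - 1e-12)
        cost += -label * math.log(hc) - (1 - label) * math.log(1.0 - hc)
        for j, xj in enumerate(x):
            grad[j] += (h - label) * xj
    return cost / m, [g / m for g in grad]

def predict(theta, X, threshold=0.5):
    """Label 1 (admitted) when the estimated probability reaches the threshold."""
    return [1 if hypothesis(theta, x) >= threshold else 0 for x in X]

# Made-up rows: [1 (intercept term), exam 1 score, exam 2 score]
X = [[1.0, 35.0, 40.0], [1.0, 80.0, 85.0], [1.0, 50.0, 45.0], [1.0, 90.0, 70.0]]
y = [0, 1, 0, 1]

cost, grad = cost_and_gradient([0.0, 0.0, 0.0], X, y)
print(round(cost, 4))  # 0.6931, i.e. ln 2, because every prediction is 0.5 at theta = 0
print(predict([-100.0, 1.0, 1.0], X))
```

In the exercise itself the cost and gradient would normally be handed to an optimizer rather than evaluated once. The sketch only shows what each function is expected to compute.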

2. Regularized logistic regression

  In this part of the exercise, you will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA). During QA, each microchip goes through various tests to make sure it is working properly. Suppose you are the product manager of the plant and you have the results of two different tests for some microchips. From these two tests, you want to determine whether each microchip should be accepted or rejected. To help you make the decision, you have a dataset of test results on past microchips, from which you can build a logistic regression model.

  You will use another script, ex2_reg.m, to complete this part of the exercise.
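As a rough illustration of what mapFeature.m and costFunctionReg.m are meant to compute, the sketch below builds polynomial features from two test results and evaluates a regularized cost. It is plain Python, not the exercise's Octave/MATLAB code. The text above does not state the penalty formula, so the sketch assumes the common convention: λ/(2m) times the sum of squared weights, with the intercept excluded. The polynomial degree, function names, and data points are also assumptions made up for the example.

```python
import math

def map_feature(x1, x2, degree):
    """All monomials x1^(i-j) * x2^j for 0 <= j <= i <= degree; the first entry is the constant 1."""
    return [x1 ** (i - j) * x2 ** j for i in range(degree + 1) for j in range(i + 1)]

def sigmoid(z):
    """Logistic function, split by sign to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def cost_reg(theta, X, y, lam):
    """Cross-entropy cost plus an L2 penalty on every weight except theta[0] (assumed convention)."""
    m = len(y)
    cost = 0.0
    for x, label in zip(X, y):
        h = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        h = min(max(h, 1e-12), 1.0 - 1e-12)  # keep log() away from 0
        cost += -label * math.log(h) - (1 - label) * math.log(1.0 - h)
    penalty = lam / (2.0 * m) * sum(t * t for t in theta[1:])
    return cost / m + penalty

# Made-up microchip test results (test 1, test 2) and accept/reject labels
points = [(0.1, 0.2), (-0.8, 0.9), (0.3, -0.2), (0.9, 0.8)]
y = [1, 0, 1, 0]
X = [map_feature(a, b, 2) for a, b in points]  # degree 2 gives 6 features per chip

theta = [1.0] + [0.5] * 5
print(cost_reg(theta, X, y, lam=0.0))
print(cost_reg(theta, X, y, lam=1.0))
```

The two printed values differ only by the penalty term, which is one way to check that the intercept is left out of the regularization.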

 

OPTIONAL

In this part of the exercise, you will try different values of the regularization parameter on the dataset to understand how regularization prevents overfitting. Notice how the decision boundary changes as you vary λ. With a small λ, you should find that the classifier gets almost every training example right but draws a very complicated boundary, and so overfits the data (Fig. 5). This is not a good decision boundary: for example, it predicts that a point at x = (-0.25, 1.5) is accepted (y = 1), which seems to be a wrong decision given the training set.

With a larger λ, you should see a plot with a simpler decision boundary that still separates the positive and negative examples fairly well. However, if λ is set too high, you will not get a good fit: the decision boundary will not follow the data, and the model underfits the data (Fig. 6).
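The effect of λ can also be seen numerically. The sketch below is plain Python for illustration only. It fits a tiny made-up dataset with a single feature by batch gradient descent, using an assumed learning rate and iteration count, and reports the size of the learned weight for several values of λ. It is not the exercise's optimizer or data, and the penalty convention (λ/m times the weight in the gradient, intercept excluded) is an assumption.

```python
import math

def sigmoid(z):
    """Logistic function, split by sign to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit(X, y, lam, lr=0.1, iters=2000):
    """Batch gradient descent on the regularized cost; theta[0] (intercept) is not penalized."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for x, label in zip(X, y):
            err = sigmoid(sum(t * xi for t, xi in zip(theta, x))) - label
            for j in range(n):
                grad[j] += err * x[j] / m
        for j in range(1, n):
            grad[j] += lam / m * theta[j]  # gradient of the L2 penalty
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

# Made-up, deliberately non-separable data: rows are [1 (intercept term), feature]
X = [[1.0, v] for v in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)]
y = [0, 0, 1, 0, 1, 1]

weights = {lam: abs(fit(X, y, lam)[1]) for lam in (0.0, 1.0, 10.0)}
for lam, w in weights.items():
    print(f"lambda={lam:5.1f}  |theta_1|={w:.4f}")
```

The weight shrinks toward zero as λ grows. With polynomial features, the same shrinkage is what flattens the decision boundary, and with a very large λ it leaves the model too simple to follow the data.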

 

 

 

 

 


Origin www.cnblogs.com/weststar/p/11670562.html