☆1 Notes on plotting the learning curve
When plotting the learning curve of error_train and error_cv as the number of training examples grows, theta must be retrained inside each loop iteration using only the first i examples, and that theta is then used to compute J_train and J_cv. Note that J_train is evaluated with m = i, i.e. on the training subset, while J_cv is evaluated on the entire cross-validation set every time.
In other words, as the number of training examples increases, the trained theta keeps changing, and the cross-validation set measures how well the theta trained on the current subset generalizes; that is why val_error must be computed over the whole validation set, whereas train_error is the error on just the examples currently used for training.
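The procedure above can be sketched in NumPy (a hedged translation of the Octave logic, not the course's grader code; `train` here uses a closed-form ridge solution in place of the course's iterative trainLinearReg):

```python
import numpy as np

def cost(X, y, theta, lam):
    """Regularized linear-regression cost; pass lam=0 when measuring error."""
    m = len(y)
    h = X @ theta
    return np.sum((h - y) ** 2) / (2 * m) + lam / (2 * m) * np.sum(theta[1:] ** 2)

def train(X, y, lam):
    """Closed-form ridge solution; the bias term is not regularized."""
    L = lam * np.eye(X.shape[1])
    L[0, 0] = 0
    return np.linalg.solve(X.T @ X + L, X.T @ y)

def learning_curve(X, y, Xval, yval, lam):
    m = X.shape[0]
    err_train, err_val = np.zeros(m), np.zeros(m)
    for i in range(1, m + 1):
        theta = train(X[:i], y[:i], lam)                  # fit on the first i examples
        err_train[i - 1] = cost(X[:i], y[:i], theta, 0)   # error on that subset only
        err_val[i - 1] = cost(Xval, yval, theta, 0)       # error on the FULL val set
    return err_train, err_val
```

The two comments inside the loop are the whole point: the training error uses the i-example subset, the validation error always uses every validation example, and both are computed with lambda = 0.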
☆2 Notes on computing the error
Use λ = 0: regularization applies only when training theta, not when measuring error_train or error_val.
The code:
function [J, grad] = linearRegCostFunction(X, y, theta, lambda)
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta)); % n*1

% ====================== YOUR CODE HERE ======================
h = X*theta;  % theta = n*1, X = m*n, so h = m*1
J = 1/2/m * sum((h-y).^2) + lambda/2/m * sum(theta(2:end,:).^2);
grad(2:end) = 1/m * (X'*(h-y))(2:end) + lambda/m*theta(2:end);
grad(1) = 1/m * (X'*(h-y))(1);
% =========================================================================

grad = grad(:);

end
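For cross-checking the formula, here is the same cost and gradient as a NumPy sketch (my translation, not part of the exercise; note that theta[0], the bias term, is excluded from regularization exactly as theta(1) is above):

```python
import numpy as np

def linear_reg_cost(X, y, theta, lam):
    """Regularized linear-regression cost and gradient.
    The bias term theta[0] is not regularized."""
    m = len(y)
    err = X @ theta - y                  # residuals, shape (m,)
    J = np.sum(err ** 2) / (2 * m) + lam / (2 * m) * np.sum(theta[1:] ** 2)
    grad = (X.T @ err) / m               # unregularized gradient
    grad[1:] += lam / m * theta[1:]      # add regularization, skipping the bias
    return J, grad
```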
function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
m = size(X, 1);

% You need to return these values correctly
error_train = zeros(m, 1);
error_val   = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
for i = 1:m
    sample_x = X(1:i, :);
    sample_y = y(1:i);
    [theta] = trainLinearReg(sample_x, sample_y, lambda);
    [J, grad] = linearRegCostFunction(sample_x, sample_y, theta, 0);
    error_train(i) = J;
    [J, grad] = linearRegCostFunction(Xval, yval, theta, 0);
    error_val(i) = J;
end

end
function [X_poly] = polyFeatures(X, p)
X_poly = zeros(numel(X), p);

% ====================== YOUR CODE HERE ======================
for i = 1:p
    X_poly(:, i) = X.^i;
end
% =========================================================================

end
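The same feature mapping in NumPy (a sketch of the one-power-per-column idea, assuming X is a column vector of raw inputs):

```python
import numpy as np

def poly_features(X, p):
    """Map each value x to the row [x, x^2, ..., x^p] (one power per column)."""
    X = np.asarray(X, dtype=float).ravel()
    return np.column_stack([X ** i for i in range(1, p + 1)])
```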
function [lambda_vec, error_train, error_val] = ...
    validationCurve(X, y, Xval, yval)
lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';

% You need to return these variables correctly.
error_train = zeros(length(lambda_vec), 1);
error_val   = zeros(length(lambda_vec), 1);

% ====================== YOUR CODE HERE ======================
for i = 1:length(lambda_vec)
    lambda = lambda_vec(i);
    theta = trainLinearReg(X, y, lambda);
    [error_train(i), grad] = linearRegCostFunction(X, y, theta, 0);
    [error_val(i), grad]   = linearRegCostFunction(Xval, yval, theta, 0);
end

end
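The validation-curve loop follows the same pattern, sketched below in NumPy (self-contained, with a closed-form ridge trainer standing in for trainLinearReg): each candidate lambda is used for training, but both errors are then measured with lambda = 0.

```python
import numpy as np

def _cost(X, y, theta, lam):
    m = len(y)
    err = X @ theta - y
    return np.sum(err ** 2) / (2 * m) + lam / (2 * m) * np.sum(theta[1:] ** 2)

def _train(X, y, lam):
    # Closed-form ridge solution; the bias column is not regularized.
    L = lam * np.eye(X.shape[1])
    L[0, 0] = 0
    return np.linalg.solve(X.T @ X + L, X.T @ y)

def validation_curve(X, y, Xval, yval):
    lambda_vec = np.array([0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10])
    error_train = np.zeros(len(lambda_vec))
    error_val = np.zeros(len(lambda_vec))
    for i, lam in enumerate(lambda_vec):
        theta = _train(X, y, lam)                 # train WITH the candidate lambda
        error_train[i] = _cost(X, y, theta, 0)    # but evaluate with lambda = 0
        error_val[i] = _cost(Xval, yval, theta, 0)
    return lambda_vec, error_train, error_val
```

Plotting error_train and error_val against lambda_vec then shows which lambda gives the lowest cross-validation error.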