【机器学习】C++ 从零实现神经网络（三）神经网络的训练和测试

长文预警: 共22727字

注意：文末附有所有源码的地址

建议：收藏后找合适时间阅读。
在这里插入图片描述

三、神经网络的训练和测试

前言

在之前的文章中我们已经实现了Net类的设计和前向传播和反向传播的过程。可以说神经网络的核心的部分已经完成。接下来就是应用层面了。

要想利用神经网络解决实际的问题，比如说进行手写数字的识别，需要用神经网络对样本进行迭代训练，训练完成之后，训练得到的模型是好是坏，我们需要对之进行测试。这正是我们现在需要实现的部分的内容。

完善后的Net类

需要知道的是现在的Net类已经相对完善了，为了实现接下来的功能，不论是成员变量还是成员函数都变得更加的丰富。现在的Net类看起来是下面的样子：

class Net
    {
    public:
        //Integer vector specifying the number of neurons in each layer including the input and output layers.
        std::vector<int> layer_neuron_num;
        std::string activation_function = "sigmoid";
        double learning_rate; 
        double accuracy = 0.;
        std::vector<double> loss_vec;
        float fine_tune_factor = 1.01;
    protected:
        std::vector<cv::Mat> layer;
        std::vector<cv::Mat> weights;
        std::vector<cv::Mat> bias;
        std::vector<cv::Mat> delta_err;

        cv::Mat output_error;
        cv::Mat target;
        float loss;

    public:
        Net() {};
        ~Net() {};

        //Initialize net:genetate weights matrices、layer matrices and bias matrices
        // bias default all zero
        void initNet(std::vector<int> layer_neuron_num_);

        //Initialise the weights matrices.
        void initWeights(int type = 0, double a = 0., double b = 0.1);

        //Initialise the bias matrices.
        void initBias(cv::Scalar& bias);

        //Forward
        void forward();

        //Forward
        void backward();

        //Train,use loss_threshold
        void train(cv::Mat input, cv::Mat target_, float loss_threshold, bool draw_loss_curve = false);        //Test
        void test(cv::Mat &input, cv::Mat &target_);

        //Predict,just one sample
        int predict_one(cv::Mat &input);

        //Predict,more  than one samples
        std::vector<int> predict(cv::Mat &input);

        //Save model;
        void save(std::string filename);

        //Load model;
        void load(std::string filename);

    protected:
        //initialise the weight matrix.if type =0,Gaussian.else uniform.
        void initWeight(cv::Mat &dst, int type, double a, double b);

        //Activation function
        cv::Mat activationFunction(cv::Mat &x, std::string func_type);

        //Compute delta error
        void deltaError();

        //Update weights
        void updateWeights();
    };

可以看到已经有了训练的函数train()、测试的函数test()，还有实际应用训练好的模型的predict()函数，以及保存和加载模型的函数save()和load()。大部分成员变量和成员函数应该还是能够通过名字就能够知道其功能的。

训练

训练函数train()

本文重点说的是训练函数train()和测试函数test()。这两个函数接受输入（input）和标签（或称为目标值target）作为输入参数。其中训练函数还要接受一个阈值作为迭代终止条件，最后一个函数可以暂时忽略不计，那是选择要不要把loss值实时画出来的标识。

训练的过程如下：

接受一个样本（即一个单列矩阵）作为输入，也即神经网络的第一层；
进行前向传播，也即forward()函数做的事情。然后计算loss；
如果loss值小于设定的阈值loss_threshold，则进行反向传播更新阈值；
重复以上过程直到loss小于等于设定的阈值。

train函数的实现如下：

//Train,use loss_threshold
    void Net::train(cv::Mat input, cv::Mat target_, float loss_threshold, bool draw_loss_curve)
    {
        if (input.empty())
        {
            std::cout << "Input is empty!" << std::endl;
            return;
        }

        std::cout << "Train,begain!" << std::endl;

        cv::Mat sample;
        if (input.rows == (layer[0].rows) && input.cols == 1)
        {
            target = target_;
            sample = input;
            layer[0] = sample;
            forward();
            //backward();
            int num_of_train = 0;
            while (loss > loss_threshold)
            {
                backward();
                forward();
                num_of_train++;
                if (num_of_train % 500 == 0)
                {
                    std::cout << "Train " << num_of_train << " times" << std::endl;
                    std::cout << "Loss: " << loss << std::endl;
                }
            }
            std::cout << std::endl << "Train " << num_of_train << " times" << std::endl;
            std::cout << "Loss: " << loss << std::endl;
            std::cout << "Train sucessfully!" << std::endl;
        }
        else if (input.rows == (layer[0].rows) && input.cols > 1)
        {
            double batch_loss = loss_threshold + 0.01;
            int epoch = 0;
            while (batch_loss > loss_threshold)
            {
                batch_loss = 0.;
                for (int i = 0; i < input.cols; ++i)
                {
                    target = target_.col(i);
                    sample = input.col(i);
                    layer[0] = sample;

                    farward();
                    backward();

                    batch_loss += loss;
                }

                loss_vec.push_back(batch_loss);

                if (loss_vec.size() >= 2 && draw_loss_curve)
                {
                    draw_curve(board, loss_vec);
                }
                epoch++;
                if (epoch % output_interval == 0)
                {
                    std::cout << "Number of epoch: " << epoch << std::endl;
                    std::cout << "Loss sum: " << batch_loss << std::endl;
                }
                if (epoch % 100 == 0)
                {
                    learning_rate *= fine_tune_factor;
                }
            }
            std::cout << std::endl << "Number of epoch: " << epoch << std::endl;
            std::cout << "Loss sum: " << batch_loss << std::endl;
            std::cout << "Train sucessfully!" << std::endl;
        }
        else
        {
            std::cout << "Rows of input don't cv::Match the number of input!" << std::endl;
        }
    }

这里考虑到了用单个样本和多个样本迭代训练两种情况。而且还有另一种不用loss阈值作为迭代终止条件，而是用正确率的train()函数，内容大致相同，此处略去不表。

在经过train()函数的训练之后，就可以得到一个模型了。所谓模型，可以简单的认为就是权值矩阵。简单的说，可以把神经网络当成一个超级函数组合，我们姑且认为这个超级函数就是y = f(x) = ax +ｂ。那么权值就是ａ和ｂ。反向传播的过程是把ａ和ｂ当成自变量来处理的，不断调整以得到最优值或逼近最优值。在完成反向传播之后，训练得到了参数ａ和ｂ的最优值，是一个固定值了。这时自变量又变回了ｘ。我们希望ａ、ｂ最优值作为已知参数的情况下，对于我们的输入样本ｘ，通过神经网络计算得到的结果ｙ，与实际结果相符合是大概率事件。

测试

测试函数test()

test()函数的作用就是用一组训练时没用到的样本，对训练得到的模型进行测试，把通过这个模型得到的结果与实际想要的结果进行比较，看正确来说到底是多少，我们希望正确率越多越好。

test()的步骤大致如下几步：

用一组样本逐个输入神经网络；
通过前向传播得到一个输出值；
比较实际输出与理想输出，计算正确率。

test()函数的实现如下：

//Test
    void Net::test(cv::Mat &input, cv::Mat &target_)
    {
        if (input.empty())
        {
            std::cout << "Input is empty!" << std::endl;
            return;
        }
        std::cout << std::endl << "Predict,begain!" << std::endl;

        if (input.rows == (layer[0].rows) && input.cols == 1)
        {
            int predict_number = predict_one(input);

            cv::Point target_maxLoc;
            minMaxLoc(target_, NULL, NULL, NULL, &target_maxLoc, cv::noArray());        
            int target_number = target_maxLoc.y;

            std::cout << "Predict: " << predict_number << std::endl;
            std::cout << "Target:  " << target_number << std::endl;
            std::cout << "Loss: " << loss << std::endl;
        }
        else if (input.rows == (layer[0].rows) && input.cols > 1)
        {
            double loss_sum = 0;
            int right_num = 0;
            cv::Mat sample;
            for (int i = 0; i < input.cols; ++i)
            {
                sample = input.col(i);
                int predict_number = predict_one(sample);
                loss_sum += loss;

                target = target_.col(i);
                cv::Point target_maxLoc;
                minMaxLoc(target, NULL, NULL, NULL, &target_maxLoc, cv::noArray());
                int target_number = target_maxLoc.y;

                std::cout << "Test sample: " << i << "   " << "Predict: " << predict_number << std::endl;
                std::cout << "Test sample: " << i << "   " << "Target:  " << target_number << std::endl << std::endl;
                if (predict_number == target_number)
                {
                    right_num++;
                }
            }
            accuracy = (double)right_num / input.cols;
            std::cout << "Loss sum: " << loss_sum << std::endl;
            std::cout << "accuracy: " << accuracy << std::endl;
        }
        else
        {
            std::cout << "Rows of input don't cv::Match the number of input!" << std::endl;
            return;
        }
    }

这里在进行前向传播的时候不是直接调用forward()函数，而是调用了predict_one()函数，predict函数的作用是给定一个输入，给出想要的输出值。其中包含了对forward()函数的调用。还有就是对于神经网络的输出进行解析，转换成看起来比较方便的数值。

源码链接

所有的代码都已经托管在Github上面，感兴趣的可以去下载查看。源码链接地址为

https://github.com/LiuXiaolong19920720/simple_net

学如逆水行舟，不进则退

一百个Chocolate

发布了470 篇原创文章 · 获赞 1150 · 访问量 17万+

私信关注