Deep learning - Introduction to Neural Networks

  1. Creating a machine learning algorithm means building a model that outputs correct information. Think of this model as a black box: we feed it input and it delivers an output.
  2. Training is a central concept in machine learning, as this is the process through which the model learns how to make sense of the input data. Once we have trained our model, we can simply feed it data and obtain an output.
  3. The basic logic behind training an algorithm involves four ingredients: data, model, objective function, and optimization algorithm.
  4. The training process is essentially a trial-and-error process, but each subsequent trial is better than the previous one, as we have methods in place that give feedback to the algorithm.
  5. There are three major types of machine learning: supervised, unsupervised, and reinforcement.
  6. Supervised learning refers to the case where we provide the algorithm with inputs and their corresponding desired outputs. Based on this information, it learns how to produce outputs as close as possible to the ones we are looking for.
  7. In unsupervised learning, we feed inputs, but there are no target outputs. This means we don’t tell the algorithm exactly what our goal is; instead, we ask it to find some sort of dependence or underlying logic in the data provided.
  8. Reinforcement learning: without digging too deep into it, in reinforcement learning we would train a model to act in an environment based on the rewards it receives.
  9. Supervised learning can be divided into two subtypes: classification and regression. Classification models provide outputs that are categories, such as cats or dogs. Regression models provide outputs of a numerical type; for instance, predicting the EUR/USD exchange rate will always give us a continuous number like 1.10 or 1.19.
  10. The linear model, despite appearing oversimplified, is extremely important, as it is the basis for more complicated models, including non-linear ones. In the linear model, the output is f(x) = xw + b (see the linear-model sketch after this list).
  11. There are many ways to define the linear model.
  12. The objective function: the measure used to evaluate how well the model’s outputs match the desired correct values.
  13. The objective function is split into two types: the loss function (also called the cost function) and the reward function.
  14. The lower the loss function, the higher the level of accuracy of the model.
  15. The higher the reward function, the higher the level of accuracy of the model. Reward functions are usually used in reinforcement learning, where the goal is to maximize a specific result.
  16. The loss function goes with supervised learning; the reward function goes with reinforcement learning.
  17. We divided supervised learning into two types: regression and classification.
  18. The target: denoted by T. The target is essentially the desired value at which we are aiming. Generally we want our output y to be as close as possible to the target T.
  19. Loss functions are divided into two groups: those for regression and those for classification.
  20. The outputs of a regression are continuous numbers. A commonly used loss function is the squared loss, also called the L2-norm loss in the machine learning realm. The method for calculating it is the same as the least squares method used in statistics: the sum of the squared differences between the output values y and the targets T. The lower this sum, the lower the error of prediction, and therefore the lower the cost function (see the squared-loss sketch after this list).
  21. The targets are the labels, so they are always correct. We are trying to obtain outputs that are closest to the targets, so it is more correct to say that the objective function measures how well the outputs match the targets.
  22. Any function that holds the basic property (higher for worse results, lower for better results) can be a loss function.
  23. Cross-entropy loss is used for classification (see the cross-entropy sketch after this list).
  24. Oscillation: a repetitive variation around a central value.
  25. Generally, we want the learning rate to be: high enough, so we can reach the closest minimum in a reasonable amount of time; and low enough, so we don’t oscillate around the minimum.
  26. Gradient descent: 1. We can find the minimum by trial and error. 2. Each trial is better than the previous one. 3. The learning rate should be high enough so we don’t iterate forever, but low enough so we don’t oscillate forever. 4. Once we have converged, we should stop updating, or, as the coding example after this list shows, break the loop.
  27. The gradient is a generalization of the derivative concept to functions of several variables.
  28. Gradient descent is a type of optimization algorithm.
  29. The N-parameter gradient descent updates many weights and biases. The 1-parameter gradient descent can still have many inputs, outputs, and targets, but it relates to a single weight.
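
A minimal sketch of point 10's linear model f(x) = xw + b, using NumPy; the shapes and numbers below are illustrative assumptions, not values from the notes:

```python
import numpy as np

# Illustrative data: 3 samples with 2 input features each.
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # inputs
w = np.array([[0.5],
              [-0.2]])       # weights, shape (2, 1)
b = 0.1                      # bias

# The linear model: f(x) = xw + b.
y = np.dot(x, w) + b         # outputs, shape (3, 1)
print(y)
```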
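
Point 20's squared (L2-norm) loss is just the sum of squared differences between outputs and targets. A minimal sketch with made-up numbers:

```python
import numpy as np

def l2_loss(outputs, targets):
    # Sum of the squared differences between outputs y and targets T.
    return np.sum((outputs - targets) ** 2)

y = np.array([1.1, 0.9, 2.0])   # model outputs (illustrative)
t = np.array([1.0, 1.0, 2.5])   # targets, i.e. the labels
print(l2_loss(y, t))            # 0.27 -- the lower, the better the fit
```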
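
For point 23, here is one common form of cross-entropy for a single classification example, assuming the model already outputs class probabilities and the target is one-hot encoded (all numbers are illustrative):

```python
import numpy as np

def cross_entropy(probs, targets):
    # Cross-entropy between predicted probabilities and a one-hot target.
    eps = 1e-12                      # guards against log(0)
    return -np.sum(targets * np.log(probs + eps))

p = np.array([0.7, 0.2, 0.1])   # predicted class probabilities
t = np.array([1.0, 0.0, 0.0])   # one-hot target: the true class is class 0
print(cross_entropy(p, t))      # ~0.357 -- the lower, the better
```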
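
Finally, the coding example referenced in point 26: a 1-parameter gradient descent. The loss L(w) = (w - 3)^2 (minimum at w = 3), the learning rate, and the stopping tolerance are all illustrative assumptions:

```python
# 1-parameter gradient descent on L(w) = (w - 3) ** 2.
w = 0.0                     # initial guess for the single weight
eta = 0.1                   # learning rate: high enough to converge in a
                            # reasonable amount of time, low enough not to
                            # oscillate around the minimum
for step in range(1000):
    grad = 2 * (w - 3)      # dL/dw
    update = eta * grad
    w -= update
    if abs(update) < 1e-6:  # converged: stop updating -- break the loop
        break
print(step, w)              # w ends up very close to the minimum at 3
```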

Reposted from blog.csdn.net/BSCHN123/article/details/103755884