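The connection structure is set manually; the weights and biases are tuned (learned from data). A neural network structure therefore defines a set of functions, one for each choice of parameters.
"Deep" refers to having many hidden layers.
The network can easily be expressed as a sequence of matrix operations.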
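A minimal sketch of the matrix form mentioned above, assuming fully connected layers with a sigmoid activation (the notes do not name the activation). The helper names and the tiny 2-2-1 network with hand-picked parameters are hypothetical, chosen only for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, x):
    """One fully connected layer: sigmoid(W x + b), with W stored as a list of rows."""
    return [
        sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]

def network_forward(layers, x):
    """Apply each (W, b) pair in turn; the whole network is one function of x."""
    for W, b in layers:
        x = layer_forward(W, b, x)
    return x

# Hypothetical parameters: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -2.0], [-1.0, 1.0]], [1.0, 0.0]),  # hidden layer weights and biases
    ([[2.0, -1.0]], [0.0]),                    # output layer weights and biases
]

hidden = layer_forward(layers[0][0], layers[0][1], [1.0, -1.0])  # approximately [0.98, 0.12]
output = network_forward(layers, [1.0, -1.0])
```

Changing the numbers in `layers` while keeping the shapes fixed picks a different function from the same function set.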
------------
Handwritten digit example:
The input is a 256-dimensional vector; the output is 10-dimensional, with each dimension giving the probability that the input is one of the ten digits.
The loss is the cross-entropy between y and ŷ.
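A small sketch of the loss computation for one example. It assumes that y is the network's 10-dimensional output, that ŷ is a one-hot target, and that the output probabilities come from a softmax over raw scores; none of these choices is stated in the notes. The scores below are made up:

```python
import math

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y, y_hat):
    """C(y, y_hat) = -sum_i y_hat_i * ln(y_i); y is predicted, y_hat is the target."""
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(p + eps) for p, t in zip(y, y_hat))

# Hypothetical raw scores for the 10 output dimensions.
scores = [0.0] * 10
scores[3] = 2.0
y = softmax(scores)

# One-hot target: the true class is the one at index 3.
y_hat = [0.0] * 10
y_hat[3] = 1.0

loss = cross_entropy(y, y_hat)
```

With a one-hot target the sum reduces to the negative log of the probability assigned to the correct class, so the loss shrinks as that probability approaches 1.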
Find the parameters θ* that minimize the total loss.
How do we find the best parameters θ*? With gradient descent.
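The update rule behind gradient descent is θ ← θ − η ∇L(θ). The sketch below applies it to a toy two-parameter quadratic loss that stands in for the network's total loss; the loss function, the learning rate η = 0.1, and the step count are all illustrative assumptions, not values from the notes:

```python
def total_loss(theta):
    """Toy stand-in for the total loss, minimized at theta = (3, -1)."""
    return (theta[0] - 3.0) ** 2 + (theta[1] + 1.0) ** 2

def gradient(theta):
    """Analytic gradient of the toy loss above."""
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]

def gradient_descent(theta, learning_rate=0.1, steps=200):
    """Repeat theta <- theta - learning_rate * gradient(theta)."""
    for _ in range(steps):
        grad = gradient(theta)
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta

theta_star = gradient_descent([0.0, 0.0])
```

For a real network the gradient would come from backpropagation over the training examples rather than from a closed-form expression, and the result is in general only a local minimum.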