First, a neural network algorithm is really a combination of several building blocks. To understand a CNN (convolutional neural network), you need to know the following.
Here are two topics I summarized earlier, based on Andrew Ng's machine learning course:
Cost function:
The cost function is defined over the entire training set: it is the mean of the errors of all samples, i.e. the average of the loss function.
For a more detailed explanation, see my blog post:
https://blog.csdn.net/qq_40594554/article/details/97389489
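As a minimal sketch of the definition above, the cost can be computed as the average of a per-sample loss. The squared-error loss and the sample numbers here are assumptions chosen for illustration; they are not taken from the blog post.

```python
def squared_error(prediction, target):
    """Loss for a single sample (squared error is an assumed choice)."""
    return (prediction - target) ** 2


def cost(predictions, targets):
    """Cost over the whole training set: the mean of the per-sample losses."""
    losses = [squared_error(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)


# Hypothetical predictions and targets for three samples.
predictions = [2.0, 4.0, 7.0]
targets = [2.0, 5.0, 5.0]
print(cost(predictions, targets))  # per-sample losses are 0, 1 and 4
```

The loss measures the error of one sample, while the cost averages that error over every sample in the training set.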
Gradient descent:
Gradient descent is usually explained with the single-variable case, but in practice the multivariable version is the one commonly used.
Single-variable gradient descent can be understood through the method of least squares, and multivariable gradient descent builds on the same idea. For a detailed explanation, see my blog post:
https://blog.csdn.net/qq_40594554/article/details/97391667
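The single-variable case can be sketched as follows: fit a single weight w so that w * x matches y, by repeatedly stepping against the gradient of the mean squared error. The data, learning rate, and step count are assumptions picked so the example converges; they do not come from the blog post.

```python
# Hypothetical training data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def cost(w):
    """Mean squared error of the model y = w * x over the training set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the cost with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(w=0.0, learning_rate=0.05, steps=200):
    """Repeatedly move w a small step in the direction that lowers the cost."""
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


w = gradient_descent()
print(w, cost(w))  # w approaches 2 and the cost approaches 0
```

The multivariable version applies the same update to every weight at once, using the partial derivative of the cost with respect to each weight.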
With this foundation in place, we can move on to the convolutional neural network, one of the deep neural networks.
My two earlier posts on neural networks cover the background:
Neural Networks: https://blog.csdn.net/qq_40594554/article/details/97399935
Getting started with neural networks: https://blog.csdn.net/qq_40594554/article/details/97617154
A neural network works roughly as follows:
For example, suppose we have three inputs x1, x2, x3 and feed them into the network. We call L1 the input layer, L2 the hidden layer, and L3 the output layer.
Put simply: the input layer L1 holds the three inputs, which pass through three neural units a1, a2, a3. The inputs x1, x2, x3 carry one set of weights in a1, a different set in a2, and yet another set in a3.
Minimizing the cost function gives the optimal weights, which determine the values of a1, a2, a3. These values then pass through the neuron in L3, which computes the output from its own weights. This is about the simplest neural network there is; if it is still unclear, see my blog posts above.
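The forward pass through this three-layer network can be sketched in plain Python. The sigmoid activation and every weight and bias value below are assumptions for illustration only; in practice the weights come from minimizing the cost function.

```python
import math


def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Compute the hidden activations a1..a3 (layer L2) and the output (layer L3)."""
    a = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    output = sigmoid(sum(w * ai for w, ai in zip(output_weights, a)) + output_bias)
    return a, output


# Hypothetical inputs x1, x2, x3 and hypothetical weights.
x = [1.0, 2.0, 3.0]
hidden_weights = [
    [0.1, 0.2, 0.3],    # weights of x1, x2, x3 in a1
    [-0.3, 0.1, 0.2],   # weights of x1, x2, x3 in a2
    [0.05, -0.2, 0.1],  # weights of x1, x2, x3 in a3
]
hidden_biases = [0.0, 0.0, 0.0]
output_weights = [0.4, -0.6, 0.2]
output_bias = 0.1

a, output = forward(x, hidden_weights, hidden_biases, output_weights, output_bias)
print(a, output)
```

Each row of hidden_weights is the separate set of weights that x1, x2, x3 carry in one hidden unit, which matches the description above.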
Next, I implement an ordinary neural network with TensorFlow, that is, one whose weights are determined through several hidden layers.
The neuron computations above work like a black box: we only see the final result, not how it was reached. In a traditional neural network the number of neurons to compute is usually very large, so the traditional approach falls short: it is computationally expensive and can still give low accuracy. That is why the convolutional neural network appeared!
As the figure above shows, there are a great many weights to compute, so we need a lot of training samples. The model has to be sized according to the amount of data, to prevent both overfitting and underfitting.
A CNN therefore uses receptive fields and shared weights to reduce the number of parameters the network has to train, as shown in the figure (you can search Baidu for what these terms mean):
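A quick calculation makes the saving concrete. The sketch below compares a fully connected layer on a 28 * 28 image with a convolutional layer of 5 * 5 kernels. The 1024 hidden units are an assumption chosen for the comparison; the kernel shape follows the first convolutional layer used later in this post.

```python
def dense_params(n_inputs, n_units):
    """Fully connected layer: one weight per input per unit, plus one bias per unit."""
    return n_inputs * n_units + n_units


def conv_params(kernel_h, kernel_w, in_channels, n_kernels):
    """Convolutional layer: each kernel shares its weights across the whole image."""
    return kernel_h * kernel_w * in_channels * n_kernels + n_kernels


print(dense_params(28 * 28, 1024))  # 803840
print(conv_params(5, 5, 1, 32))     # 832
```

Because every kernel only looks at a small receptive field and reuses the same weights at every position, the parameter count no longer depends on the image size.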
In fact, a convolutional neural network is easy to understand.
Its simplest sequence of steps is: convolutional layer - pooling layer - convolutional layer - pooling layer - ... - fully connected layer
Convolutional layer: a convolution kernel scans the data step by step, according to the kernel's stride, and produces a new feature map
Pooling layer: the feature map produced by the convolution kernel scan is pooled; the pooling layer is really a way of strengthening the feature map.
Convolutional layer (I suggest searching Baidu for what a convolution kernel is):
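The kernel scan can be sketched in plain Python. This is a simplified version with no padding and a single channel, so it is not the same as a padded library convolution; the image and kernel values are made up for illustration.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum the products.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4 * 4 image and 2 * 2 kernel.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
print(conv2d_valid(image, kernel, stride=1))  # [[-4, -4, 2], [-4, -4, 4], [6, 6, 6]]
print(conv2d_valid(image, kernel, stride=2))  # [[-4, 2], [6, 6]]
```

A larger stride makes the kernel jump further between positions, so the resulting feature map is smaller.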
Pooling layer (max pooling and mean pooling are the commonly used kinds):
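Both kinds of pooling can be sketched with one small function. The 2 * 2 window with stride 2 and the feature-map values are assumptions for illustration.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Reduce each size * size window to its maximum or its mean."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


# Hypothetical 4 * 4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
print(pool2d(feature_map, mode="max"))   # [[6, 4], [8, 9]]
print(pool2d(feature_map, mode="mean"))  # [[3.75, 2.25], [4.5, 4.0]]
```

Max pooling keeps the strongest response in each window, while mean pooling keeps the average response.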
In fact, the methods above mainly serve to extract and strengthen features.
As for the fully connected layer, it takes the pooled results, passes them through an activation function to filter out undesirable features, and outputs the result:
Here is source code for handwritten digit recognition:
# Written for TensorFlow 1.x (tf.placeholder, tf.Session).
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the data set (downloaded online if it is not present locally)
mnist = input_data.read_data_sets("MNNIST_data", one_hot=True)
# print(mnist)

# Size of each batch: 100 pictures are fed into the network per training step
batch_size = 100
# Total number of batches (// is integer division)
n_bach = mnist.train.num_examples // batch_size


# Initialize the weights
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


# Initialize the bias values
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


# Convolutional layer, using the library function tf.nn.conv2d
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


# Pooling layer
def max_pool_2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# Define two placeholders
x = tf.placeholder(tf.float32, [None, 784])  # 28 * 28 = 784 pixels
y = tf.placeholder(tf.float32, [None, 10])   # 10 classes, the digits 0-9

# Reshape x into a 4-D tensor [batch, in_height, in_width, in_channels]
x_image = tf.reshape(x, [-1, 28, 28, 1])

# Initialize the weights and biases of the first convolutional layer
W_conv1 = weight_variable([5, 5, 1, 32])  # 5 * 5 window, 32 kernels extract features from 1 plane
b_conv1 = bias_variable([32])             # one bias per convolution kernel

# Convolve x_image with the weights, add the bias, then apply the relu activation
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2(h_conv1)  # max pooling

# Initialize the weights and biases of the second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])  # 5 * 5 window, 64 kernels extract features from 32 planes
b_conv2 = bias_variable([64])              # one bias per convolution kernel

# Convolve h_pool1 with the weights, add the bias, then apply the relu activation
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2(h_conv2)  # max pooling

# Initialize the weights of the first fully connected layer
W_fcl = weight_variable([7 * 7 * 64, 1024])  # 7 * 7 * 64 inputs, 1024 neurons in the fully connected layer
b_fcl = bias_variable([1024])                # 1024 nodes

# Flatten the output of the second pooling layer to one dimension per sample
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
# Output of the first fully connected layer
h_fcl = tf.nn.relu(tf.matmul(h_pool2_flat, W_fcl) + b_fcl)

# keep_prob is the probability that a neuron's output is kept
keep_prob = tf.placeholder(tf.float32)
h_fcl_drop = tf.nn.dropout(h_fcl, keep_prob)

# Initialize the second fully connected layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

# Compute the output
logits = tf.matmul(h_fcl_drop, W_fc2) + b_fc2
prediction = tf.nn.softmax(logits)

# Cross-entropy cost function. Pass the raw logits here: the op applies
# softmax internally, so passing prediction would apply softmax twice.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
# Optimize with AdamOptimizer
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Store the results in a list of booleans
# (argmax returns the position of the largest value along the given axis)
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Train on all the images 21 times
    for epoch in range(21):
        # One pass over all the pictures in the training set
        for batch in range(n_bach):
            # Get 100 pictures: image data in batch_xs, labels in batchys
            batch_xs, batchys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x: batch_xs, y: batchys, keep_prob: 0.7})
        # Evaluate on the test set
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0})
        print("Epoch " + str(epoch) + ", accuracy: " + str(acc))
A high accuracy is reached easily, even in the first epoch:
I hope this helps you understand how the CNN algorithm works. My earlier blog posts are linked above if you want to read them.