Python - A Comprehensive Understanding of Convolutional Neural Networks - TensorFlow Digit Recognition

    First, the neural network algorithm is really a combination of many knowledge points. For CNNs (convolutional neural networks) you need to know the following:

          Here are two knowledge points (based on Andrew Ng's machine learning course) that I summarized earlier:

          Cost function:

          The cost function is defined over the entire training set as the mean of the errors over all samples, i.e. the average of the loss function (a small formula sketch follows after the link below).

          For a detailed explanation, see my blog:

          https://blog.csdn.net/qq_40594554/article/details/97389489
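          As a brief illustration, here is a minimal sketch of that definition over m training samples; the squared-error loss on the right is my own assumption about which loss is being averaged:

          J(\theta) = \frac{1}{m}\sum_{i=1}^{m} L\big(h_\theta(x^{(i)}),\, y^{(i)}\big) = \frac{1}{2m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)^2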

 

          Gradient descent:

          Gradient descent is usually explained in the univariate case, but the multivariate version is what is commonly used in practice.

          Univariate gradient descent can be understood through the method of least squares, and multivariate gradient descent builds on that knowledge; for a detailed explanation, see my blog:

          https://blog.csdn.net/qq_40594554/article/details/97391667
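          As a rough illustration only, here is a minimal univariate gradient descent that fits y = w * x by least squares; the data points and the learning rate below are made up purely for this sketch:

# Minimal sketch of univariate gradient descent for least squares,
# fitting y = w * x; data and learning rate are invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]           # roughly y = 2x
w = 0.0
alpha = 0.01                        # learning rate
for step in range(1000):
    # gradient of the cost J(w) = (1/2m) * sum((w*x - y)^2)
    grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= alpha * grad               # gradient descent update
print(w)                            # converges to roughly 2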

 

    With the above knowledge as a foundation, we can now understand the convolutional neural network within deep learning:

    You can also read my two posts on neural-network-related knowledge:

    Neural Networks: https://blog.csdn.net/qq_40594554/article/details/97399935

    Getting started with neural networks: https://blog.csdn.net/qq_40594554/article/details/97617154

 

    A neural network works roughly as follows:

    [Figure: a simple three-layer neural network with inputs x1, x2, x3]

     For example: we have three inputs x1, x2, x3 that we feed into this neural network. We call L1 the input layer, L2 the hidden layer, and L3 the output layer.

        Described simply: the input layer L1 takes the three inputs into three neural units a1, a2, a3, where x1, x2, x3 carry one set of weights into a1, a different set of weights into a2, and yet another set of weights into a3.

        Solving the cost function for its optimum gives the optimal weights, and from them the values of a1, a2, a3; then a1, a2, a3 pass through the neuron of L3, and the output is computed from their weights. This is one of the simplest neural networks; if it is not quite clear yet, you can read my blogs above, or see the small sketch below.
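    Here is a minimal sketch of that forward pass; the weight values and the sigmoid activation are my own made-up choices purely for illustration:

import numpy as np

# Minimal sketch of the three-layer network described above.
# The weight values are invented purely for illustration.
x = np.array([0.5, 0.1, 0.4])              # inputs x1, x2, x3 (layer L1)
W1 = np.array([[0.2, 0.8, 0.5],            # weights of x1, x2, x3 into a1
               [0.9, 0.3, 0.7],            # weights of x1, x2, x3 into a2
               [0.4, 0.6, 0.1]])           # weights of x1, x2, x3 into a3
W2 = np.array([0.3, 0.5, 0.2])             # weights of a1, a2, a3 into the L3 neuron

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

a = sigmoid(W1 @ x)                        # hidden layer L2: a1, a2, a3
output = sigmoid(W2 @ a)                   # output layer L3
print(a, output)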

    Now, implementing an ordinary neural network in TensorFlow means determining the weights through several hidden layers (a minimal sketch follows below).
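    A minimal sketch of such an ordinary (fully connected) network, written in the same TF 1.x style as the code further below; the hidden-layer size of 100 and the learning rate are my own assumptions, not values from this post:

import tensorflow as tf

# One hidden layer, fully connected; every weight below must be learned.
x = tf.placeholder(tf.float32, [None, 784])        # a flattened 28x28 image
y = tf.placeholder(tf.float32, [None, 10])         # one-hot label, digits 0-9

W1 = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1))
b1 = tf.Variable(tf.zeros([100]))
hidden = tf.nn.relu(tf.matmul(x, W1) + b1)         # hidden layer

W2 = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(hidden, W2) + b2                # output layer

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)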

 

    The neuron computations above are like a black box: we only see the final result, not the intermediate calculations. In a traditional neural network the number of neuron computations is generally very large, so the traditional network falls short: it is computationally intensive and still prone to low accuracy. That is why the convolutional neural network exists!

    [Figure: a traditional fully connected network, in which every input is connected to every neuron and the number of weights is very large]

    In the figure above, the number of weights we need to compute is very large, so we need a lot of training samples, and the model has to be built according to the size of the data in order to prevent over-fitting and under-fitting.

    Thus, the CNN algorithm uses local receptive fields and shared weights to reduce the number of parameters the neural network needs to train, as illustrated below (you can search Baidu to understand what these terms mean):
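    To make the saving concrete, here is a rough back-of-the-envelope comparison (a sketch only: the 1,000 hidden units in the fully connected case are my own assumption, while the 5x5 window and 32 kernels match the code later in this post):

# Rough, illustrative parameter counts for a 28x28 MNIST image.
fully_connected = 28 * 28 * 1000        # every pixel wired to 1000 hidden units
convolutional = 5 * 5 * 1 * 32 + 32     # 32 shared 5x5 kernels plus 32 biases
print(fully_connected)                  # 784000
print(convolutional)                    # 832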

 

 

    In fact, the convolutional neural network is easy to understand:

      Its simplest pipeline is: convolution layer - pooling layer - convolution layer - pooling layer - ... - fully connected layer

      Convolution layer: the data is scanned step by step by a convolution kernel according to the kernel's stride, producing a new feature map.

      Pooling layer: the feature map obtained from the convolution kernel's scan is pooled; the pooling layer is really a way of strengthening and condensing the feature map (a tiny numeric sketch of both steps follows below).
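    A tiny hand-rolled sketch of these two steps; the 5x5 "image", the 2x2 kernel, and the strides are all made-up numbers purely to show the scanning and the pooling:

import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)   # a fake 5x5 "image"
kernel = np.array([[1., 0.],
                   [0., 1.]])                      # a 2x2 convolution kernel

# Convolution layer: slide the kernel over the image with stride 1.
feat = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        feat[i, j] = np.sum(image[i:i + 2, j:j + 2] * kernel)

# Pooling layer: 2x2 max pooling with stride 2 keeps the strongest response
# in each block, shrinking the 4x4 feature map to 2x2.
pooled = feat.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(feat)
print(pooled)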

 

    Convolution layer (I suggest you search Baidu for what a convolution kernel is):

    

 

    Pooling layer (max pooling and mean pooling are commonly used):
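    A small sketch comparing the two pooling methods, written in the same TF 1.x style as the code below; the 28x28x1 input shape is an assumption chosen only for illustration:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
# Both shrink each feature map from 28x28 to 14x14; max pooling keeps the
# strongest response in each 2x2 window, mean pooling keeps the average.
max_pooled = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
mean_pooled = tf.nn.avg_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')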

 

 

    In fact, the steps above are mainly used to extract and enhance features.

    As for the fully connected layer: it takes the pooled results, passes them through the activation function to filter out undesirable feature maps, and produces the output:

    Here is the source code for handwritten digit recognition:

    

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# load the data set (downloads it from the internet on first run)
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
# print(mnist)

# batch size: feed 100 pictures into the neural network at a time
batch_size = 100
# total number of batches (// is integer division)
n_batch = mnist.train.num_examples // batch_size

# initialize the weights
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

# initialize the biases
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# convolution layer (uses the library function tf.nn.conv2d)
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# pooling layer
def max_pool_2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# define two placeholders
x = tf.placeholder(tf.float32, [None, 784])   # 28*28 = 784 pixels
y = tf.placeholder(tf.float32, [None, 10])    # digits 0-9, 10 classes

# reshape x into a 4-D tensor [batch, in_height, in_width, in_channels]
x_image = tf.reshape(x, [-1, 28, 28, 1])

# initialize the weights and biases of the first convolution layer
W_conv1 = weight_variable([5, 5, 1, 32])   # 5*5 sampling window, 32 kernels extracting features from 1 plane
b_conv1 = bias_variable([32])              # one bias per convolution kernel
# convolve x_image with the weights, add the bias, then apply the relu activation function
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2(h_conv1)              # max pooling

# initialize the weights and biases of the second convolution layer
W_conv2 = weight_variable([5, 5, 32, 64])  # 5*5 sampling window, 64 kernels extracting features from 32 planes
b_conv2 = bias_variable([64])              # one bias per convolution kernel
# convolve h_pool1 with the weights, add the bias, then apply the relu activation function
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2(h_conv2)              # max pooling

# initialize the weights of the first fully connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])  # 7*7*64 inputs fully connected to 1024 neurons
b_fc1 = bias_variable([1024])                # 1024 nodes
# flatten the output of the second pooling layer from 2-D to 1-D
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
# output of the first fully connected layer
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# keep_prob is the probability that a neuron's output is kept (dropout)
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# initialize the second fully connected layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
# compute the output; the cross-entropy op below expects the raw logits
logits = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
prediction = tf.nn.softmax(logits)
# cross-entropy cost function
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
# optimize with the Adam optimizer
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# store the results in a list of booleans
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))  # argmax returns the index of the largest value along a dimension
# compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # go through all the training images 21 times
    for epoch in range(21):
        # one epoch, i.e. one pass over all the pictures in the training set
        for batch in range(n_batch):
            # fetch 100 pictures; the image data goes into batch_xs, the labels into batch_ys
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys, keep_prob: 0.7})
        # evaluate on the test set
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0})
        print("Iter " + str(epoch) + " accuracy: " + str(acc))

    It is easy to see that a high accuracy rate is obtained right from the first run:

    [Screenshot: the per-epoch test accuracy printed during training]

     I hope this helps you understand the workflow of the CNN algorithm. My earlier blog posts are linked above; you can read those as well.