Fully Connected and Convolutional Networks in Practice

1. Fully Connected Network Practice

1.1 From understanding the MNIST dataset, through optimization and building a dataset, to implementing a specific application

1.1.1 The MNIST dataset and the model requirements

  • The MNIST dataset contains 70,000 images of handwritten digits (white digits on a black background), of which 55,000 form the training set, 5,000 the validation set and 10,000 the test set. Each image is 28*28 pixels; a pure-black pixel has value 0 and a pure-white pixel has value 1. Each label is a one-dimensional array of length 10, where the element at each index gives the probability that the image shows the corresponding digit.

  • Before the MNIST data is fed into the neural network, each image is flattened into a one-dimensional array of length 784, which serves as the network's input feature vector. The label is a one-hot encoding of the digits 0-9: a 1 at a given index means the image shows that digit (a minimal sketch of this preprocessing follows below).
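
As a quick illustration of this preprocessing, here is a minimal NumPy sketch (illustrative only, not part of the original code; `img` stands in for one 28*28 MNIST image):

import numpy as np

img = np.random.rand(28, 28)     # stand-in for one 28*28 MNIST image
x = img.reshape(784)             # flatten into a length-784 feature vector

label = 6                        # the digit shown in the image
y = np.zeros(10)
y[label] = 1.0                   # one-hot encoding: a 1 at index 6
print x.shape, y                 # (784,) [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]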

  • The following operations give a closer look at the MNIST dataset

    • Load the MNIST dataset with the read_data_sets() function from the input_data module:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('./data/',one_hot=True)
  • Return the number of samples in the training set train, validation set validation and test set test
① Number of training samples:
print "train data size:", mnist.train.num_examples
Output: train data size: 55000

② Number of validation samples:
print "validation data size:", mnist.validation.num_examples
Output: validation data size: 5000

③ Number of test samples:
print "test data size:", mnist.test.num_examples
Output: test data size: 10000
  • Use train.labels to return the labels of the dataset
To view the label of image 0 in the training set, use:
mnist.train.labels[0]
Output: array([0.,0.,0.,0.,0.,0.,1.,0.,0.,0.])
  • Use train.images to return the pixel values of the dataset images
To view the pixel values of image 0 in the training set, use:
mnist.train.images[0]
Output: array([0., 0., 0., ... ])
  • Use mnist.train.next_batch() to feed data into the neural network
BATCH_SIZE = 200  
xs,ys = mnist.train.next_batch(BATCH_SIZE)  
print "xs shape:", xs.shape
print "ys shape:", ys.shape

Output: xs shape: (200, 784)
Output: ys shape: (200, 10)

1.1.2 A first fully connected network for MNIST

  • The first version of our program has the following features:
    • Checkpointed training, so a new run resumes from the results of the previous run
    • Prediction on our own handwritten digit images

The forward-propagation code, forward.py, is as follows:

# coding: utf-8 
import tensorflow as tf 

# A single hidden layer with 500 nodes; the input is the 784 pixels of a 28*28 image (supervised learning)
INPUT_NODE = 784 
OUTPUT_NODE = 10 
LAYER_NODE = 500 

def get_weight(shape,regularizer):
    w = tf.Variable(tf.random_normal(shape),dtype=tf.float32) 
    if regularizer is not None: tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w 

def get_bias(shape):
    b = tf.Variable(tf.zeros(shape))
    return b 

def forward(x,regularizer):
    w1 = get_weight([INPUT_NODE,LAYER_NODE],regularizer) 
    b1 = get_bias([LAYER_NODE]) 
    y1 = tf.nn.relu(tf.matmul(x,w1) + b1) 

    w2 = get_weight([LAYER_NODE,OUTPUT_NODE],regularizer) 
    b2 = get_bias([OUTPUT_NODE]) 
    y = tf.matmul(y1,w2) + b2 
    return y 

The back-propagation code, backward.py, is as follows:

# coding: utf-8 
import os 
import tensorflow as tf 
import forward 
from tensorflow.examples.tutorials.mnist import input_data 

# hyperparameters
STEPS = 40000
BATCH_SIZE = 200 
LEARNING_RATE_BASE = 0.1 
LEARNING_RATE_DECAY = 0.99 
REGULARIZER = 0.001 
MOVING_AVERAGE_DECAY = 0.99 
MODEL_PATH = './model/'
MODEL_NAME = 'mnist_model'


def backward(mnist):
    x = tf.placeholder(tf.float32,[None,forward.INPUT_NODE])
    y_ = tf.placeholder(tf.float32,[None,forward.OUTPUT_NODE])
    y = forward.forward(x,REGULARIZER) 
    # global step counter, not trainable
    global_step = tf.Variable(0,trainable=False) 

    # loss: cross entropy plus L2 regularization
    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,labels=tf.argmax(y_,1)) 
    cem = tf.reduce_mean(ce) 
    loss = cem + tf.add_n(tf.get_collection('losses'))

    # exponentially decaying learning rate
    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE,global_step,mnist.train.num_examples / BATCH_SIZE,LEARNING_RATE_DECAY,staircase=True) 

    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step) 

    # exponential moving average (EMA) of the trainable variables
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step) 
    ema_op = ema.apply(tf.trainable_variables())

    # group the training step with the EMA update into a single train_op
    with tf.control_dependencies([train_step,ema_op]):
        train_op = tf.no_op(name='train') 

    # saver for checkpointing (the EMA shadow variables are saved as well)
    saver = tf.train.Saver() 

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer() 
        sess.run(init_op) 

        # resume from a checkpoint if one exists
        ckpt = tf.train.get_checkpoint_state(MODEL_PATH) 
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess,ckpt.model_checkpoint_path)

        # training loop
        for i in range(STEPS):
            xs, ys = mnist.train.next_batch(BATCH_SIZE) 
            _, loss_value, step = sess.run([train_op, loss, global_step],feed_dict={x: xs, y_: ys}) 
            if i % 1000 == 0:
                print 'After %d steps,loss on training batch is %g'% (step,loss_value) 
                saver.save(sess,os.path.join(MODEL_PATH,MODEL_NAME),global_step=global_step) 


def main():
    mnist = input_data.read_data_sets('./data/',one_hot=True)
    backward(mnist)

if __name__ == '__main__':
    main()

Test the model's accuracy. The test program below is meant to run alongside backward.py: every TEST_INTERVAL_SEC seconds it polls the checkpoint directory and evaluates the newest checkpoint on the test set:

#coding: utf-8 
import time 
import tensorflow as tf 
import forward 
import backward 
from tensorflow.examples.tutorials.mnist import input_data 
TEST_INTERVAL_SEC = 5 

def test(mnist):

    with tf.Graph().as_default() as g: 
        x = tf.placeholder(tf.float32,[None,forward.INPUT_NODE])
        y_ = tf.placeholder(tf.float32,[None,forward.OUTPUT_NODE])
        y = forward.forward(x,None) 

        ema = tf.train.ExponentialMovingAverage(backward.MOVING_AVERAGE_DECAY) 
        ema_restore = ema.variables_to_restore() 
        saver = tf.train.Saver(ema_restore) 

        # accuracy on the test set
        correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(y_,1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

        # remember the step of the last evaluated checkpoint, to detect when training has stopped
        TrainStopSignal = 0
        while True:
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(backward.MODEL_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess,ckpt.model_checkpoint_path)
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] 
                    if global_step != TrainStopSignal:          # the checkpoint step changed: evaluate and print accuracy
                        accuracy_score = sess.run(accuracy,feed_dict={x:mnist.test.images,y_: mnist.test.labels})
                        print 'After %s training step,test accuracy is %g' %(global_step,accuracy_score)
                    else: 
                        print 'Training process has stopped'
                        return 
                else:
                    print 'No checkpoint file found'
                    return
                TrainStopSignal = global_step 
            time.sleep(TEST_INTERVAL_SEC)


def main():
    mnist = input_data.read_data_sets('./data/',one_hot=True)
    test(mnist)

if __name__ == '__main__':
    main()

Here is the application program, which recognizes and classifies our own handwritten digit images:

# coding: utf-8 
# application: read an image, output which digit it shows
import numpy as np 
import tensorflow as tf 
from PIL import Image 
import forward 
import backward 

def restore_model(testPicArr):
    with tf.Graph().as_default() as tg:
        x = tf.placeholder(tf.float32,[None,forward.INPUT_NODE])
        y = forward.forward(x,None) 
        preValue = tf.argmax(y,1)

        variable_averages = tf.train.ExponentialMovingAverage(backward.MOVING_AVERAGE_DECAY)
        variable_restore = variable_averages.variables_to_restore() 
        saver = tf.train.Saver(variable_restore)

        with tf.Session() as sess: 
            ckpt = tf.train.get_checkpoint_state(backward.MODEL_PATH) 
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess,ckpt.model_checkpoint_path)

                preValue = sess.run(preValue,feed_dict={x: testPicArr})
                return preValue 
            else:
                print 'No checkpoint file found'
                return -1 


def pre_pic(testName):
    img = Image.open(testName)
    reIm = img.resize((28,28),Image.ANTIALIAS) 
    im_arr = np.array(reIm.convert('L'))
    threshold = 50 

    # invert (MNIST digits are white on black) and binarize to suppress noise
    for i in range(28):
        for j in range(28):
            im_arr[i][j] = 255 - im_arr[i][j] 
            if im_arr[i][j] < threshold:
                im_arr[i][j] = 0 
            else: 
                im_arr[i][j] = 255  
    nm_arr = im_arr.reshape([1,784])
    nm_arr = nm_arr.astype(np.float) 
    img_ready = np.multiply(nm_arr,1.0/255.0)

    return img_ready


def application():
    testNum = input('input the number of test pictures')
    for i in range(testNum):
        testPic = raw_input('the path of test picture:')
        testPicArr = pre_pic(testPic) 
        preValue = restore_model(testPicArr)
        print 'the prediction number is ', preValue

def main():
    application()

if __name__ == '__main__':
    main()

1.1.3 Further optimization of the MNIST application

  • Here we add the following optimization to the program above:
    • Build our own dataset to implement a specific application. Besides the files above, this requires an extra file that generates the data (generateds.py, listed at the end of this section).

# The four programs (forward, backward, test and application) are collected below.
# forward propagation, forward.py
# coding: utf-8 
import tensorflow as tf 

# A single hidden layer with 500 nodes; the input is the 784 pixels of a 28*28 image (supervised learning)
INPUT_NODE = 784 
OUTPUT_NODE = 10 
LAYER_NODE = 500 

def get_weight(shape,regularizer):
    w = tf.Variable(tf.random_normal(shape),dtype=tf.float32) 
    if regularizer is not None: tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w 

def get_bias(shape):
    b = tf.Variable(tf.zeros(shape))
    return b 

def forward(x,regularizer):
    w1 = get_weight([INPUT_NODE,LAYER_NODE],regularizer) 
    b1 = get_bias([LAYER_NODE]) 
    y1 = tf.nn.relu(tf.matmul(x,w1) + b1) 

    w2 = get_weight([LAYER_NODE,OUTPUT_NODE],regularizer) 
    b2 = get_bias([OUTPUT_NODE]) 
    y = tf.matmul(y1,w2) + b2 
    return y 

# back propagation, backward.py
# coding: utf-8 
import os 
import tensorflow as tf 
import forward 
import generateds      # the data-generation module (listed below)

# hyperparameters
STEPS = 40000
BATCH_SIZE = 200 
LEARNING_RATE_BASE = 0.1 
LEARNING_RATE_DECAY = 0.99 
REGULARIZER = 0.001 
MOVING_AVERAGE_DECAY = 0.99 
MODEL_PATH = './model/'
MODEL_NAME = 'mnist_model'
train_num_examples = 60000 

def backward():        # training data now comes from the TFRecord queue, so no mnist argument
    x = tf.placeholder(tf.float32,[None,forward.INPUT_NODE])
    y_ = tf.placeholder(tf.float32,[None,forward.OUTPUT_NODE])
    y = forward.forward(x,REGULARIZER) 
    # global step counter, not trainable
    global_step = tf.Variable(0,trainable=False) 

    # loss: cross entropy plus L2 regularization
    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,labels=tf.argmax(y_,1)) 
    cem = tf.reduce_mean(ce) 
    loss = cem + tf.add_n(tf.get_collection('losses'))

    # exponentially decaying learning rate
    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE,global_step,train_num_examples / BATCH_SIZE,LEARNING_RATE_DECAY,staircase=True) 

    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step) 

    # exponential moving average (EMA) of the trainable variables
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step) 
    ema_op = ema.apply(tf.trainable_variables())

    # group the training step with the EMA update into a single train_op
    with tf.control_dependencies([train_step,ema_op]):
        train_op = tf.no_op(name='train') 

    # saver for checkpointing
    saver = tf.train.Saver() 
    img_batch, label_batch = generateds.get_tfrecord(BATCH_SIZE,isTrain=True) 

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer() 
        sess.run(init_op) 

        # resume from a checkpoint if one exists
        ckpt = tf.train.get_checkpoint_state(MODEL_PATH) 
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess,ckpt.model_checkpoint_path)

        # start the input-queue threads under a coordinator
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess,coord=coord)

        # training loop
        for i in range(STEPS):
            # xs, ys = mnist.train.next_batch(BATCH_SIZE) 
            xs, ys = sess.run([img_batch,label_batch])
            _, loss_value, step = sess.run([train_op, loss, global_step],feed_dict={x: xs, y_: ys}) 
            if i % 1000 == 0:
                print 'After %d steps,loss on training batch is %g'% (step,loss_value) 
                saver.save(sess,os.path.join(MODEL_PATH,MODEL_NAME),global_step=global_step) 

        # stop the input-queue threads
        coord.request_stop()
        coord.join(threads)

def main():
    backward()

if __name__ == '__main__':
    main()


# test program, test.py
#coding: utf-8 
import time 
import tensorflow as tf 
import forward 
import backward 
import generateds
TEST_INTERVAL_SEC = 5 
TEST_NUM = 10000

def test():

    with tf.Graph().as_default() as g: 
        x = tf.placeholder(tf.float32,[None,forward.INPUT_NODE])
        y_ = tf.placeholder(tf.float32,[None,forward.OUTPUT_NODE])
        y = forward.forward(x,None) 

        ema = tf.train.ExponentialMovingAverage(backward.MOVING_AVERAGE_DECAY) 
        ema_restore = ema.variables_to_restore() 
        saver = tf.train.Saver(ema_restore) 

        # accuracy on the test batch
        correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(y_,1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

        img_batch, label_batch = generateds.get_tfrecord(TEST_NUM,isTrain=False)   

        while True:
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(backward.MODEL_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess,ckpt.model_checkpoint_path)
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] 

                    # start the input-queue threads under a coordinator
                    coord = tf.train.Coordinator()
                    threads = tf.train.start_queue_runners(sess=sess,coord=coord)

                    xs, ys = sess.run([img_batch,label_batch])

                    # accuracy_score = sess.run(accuracy,feed_dict={x:mnist.test.images,y_: mnist.test.labels})
                    accuracy_score = sess.run(accuracy,feed_dict={x: xs,y_: ys })
                    print 'After %s training step,test accuracy is %g' %(global_step,accuracy_score)

                    # stop the input-queue threads
                    coord.request_stop() 
                    coord.join(threads)

                else:
                    print 'No checkpoint file found'
                    return
            time.sleep(TEST_INTERVAL_SEC)


def main():
    # mnist = input_data.read_data_sets('./data/',one_hot=True)
    # test(mnist)
    test()

if __name__ == '__main__':
    main()


# application program, app.py 
# coding: utf-8 
# application: read an image, output which digit it shows
import numpy as np 
import tensorflow as tf 
from PIL import Image 
import forward 
import backward 

def restore_model(testPicArr):
    with tf.Graph().as_default() as tg:
        x = tf.placeholder(tf.float32,[None,forward.INPUT_NODE])
        y = forward.forward(x,None) 
        preValue = tf.argmax(y,1)

        variable_averages = tf.train.ExponentialMovingAverage(backward.MOVING_AVERAGE_DECAY)
        variable_restore = variable_averages.variables_to_restore() 
        saver = tf.train.Saver(variable_restore)

        with tf.Session() as sess: 
            ckpt = tf.train.get_checkpoint_state(backward.MODEL_PATH) 
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess,ckpt.model_checkpoint_path)

                preValue = sess.run(preValue,feed_dict={x: testPicArr})
                return preValue 
            else:
                print 'No checkpoint file found'
                return -1 


def pre_pic(testName):
    img = Image.open(testName)
    reIm = img.resize((28,28),Image.ANTIALIAS) 
    im_arr = np.array(reIm.convert('L'))
    threshold = 50 

    # invert (MNIST digits are white on black) and binarize to suppress noise
    for i in range(28):
        for j in range(28):
            im_arr[i][j] = 255 - im_arr[i][j] 
            if im_arr[i][j] < threshold:
                im_arr[i][j] = 0 
            else: 
                im_arr[i][j] = 255  
    nm_arr = im_arr.reshape([1,784])
    nm_arr = nm_arr.astype(np.float) 
    img_ready = np.multiply(nm_arr,1.0/255.0)

    return img_ready


def application():
    testNum = input('input the number of test pictures')
    for i in range(testNum):
        testPic = raw_input('the path of test picture:')
        testPicArr = pre_pic(testPic) 
        preValue = restore_model(testPicArr)
        print 'the prediction number is ', preValue

def main():
    application()

if __name__ == '__main__':
    main()

Since the goal of this round of optimization is to build our own dataset and implement a specific application, the data-generation code is listed separately here: generateds.py

# coding: utf-8 
import os 
import numpy as np 
import tensorflow as tf 
from PIL import Image 

# params 
image_train_path = './mnist_data_jpg/mnist_train_jpg_60000/'
label_train_path = './mnist_data_jpg/mnist_train_jpg_60000.txt'
image_test_path = './mnist_data_jpg/mnist_test_jpg_10000/'
label_test_path = './mnist_data_jpg/mnist_test_jpg_10000.txt'
tfRecord_train = './data/mnist_train.tfrecords'
tfRecord_test = './data/mnist_test.tfrecords'
data_path = './data/'
resize_height = 28 
resize_width = 28

def write_tfRecord(tfRecordName,image_path,label_path):
    writer = tf.python_io.TFRecordWriter(tfRecordName) 
    num_pic = 0 
    f = open(label_path,'r')
    contents = f.readlines() 
    f.close()
    for content in contents:
        value = content.split() 
        img_path = image_path + value[0] 
        img = Image.open(img_path) 
        img_raw = img.tobytes()
        labels = [0] * 10
        labels[int(value[1])] = 1 

        example = tf.train.Example(features=tf.train.Features(feature={
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw])),
            'label': tf.train.Feature(int64_list=tf.train.Int64List(value=labels))
        }))

        writer.write(example.SerializeToString())
        num_pic += 1 
        print 'the number of picture:', num_pic 
    writer.close()
    print 'write tfRecord successful'


def generate_tfRecord():
    isExists = os.path.exists(data_path) 
    if not isExists:
        os.makedirs(data_path)
        print 'The directory was created successfully'
    else:
        print 'The directory already exists'

    write_tfRecord(tfRecord_train,image_train_path,label_train_path)
    write_tfRecord(tfRecord_test,image_test_path,label_test_path) 


def read_tfRecord(tfRecord_path):
    filename_queue = tf.train.string_input_producer([tfRecord_path],shuffle=True)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(serialized_example,
                                        features={
                                            'label': tf.FixedLenFeature([10],tf.int64),
                                            'img_raw':tf.FixedLenFeature([],tf.string)
                                        })
    img = tf.decode_raw(features['img_raw'],tf.uint8)
    img.set_shape([784])
    img = tf.cast(img,tf.float32) * (1. / 255)
    label = tf.cast(features['label'],tf.float32)
    return img, label

def get_tfrecord(num,isTrain=True):
    if isTrain:
        tfRecord_path = tfRecord_train
    else:
        tfRecord_path = tfRecord_test

    img, label = read_tfRecord(tfRecord_path)

    img_batch, label_batch = tf.train.shuffle_batch([img,label],
                                                    batch_size =num,
                                                    num_threads = 2,
                                                    capacity = 1000,
                                                    min_after_dequeue = 700) 
    return img_batch, label_batch

def main():
    generate_tfRecord()

if __name__ == '__main__':
    main()
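
To verify that the .tfrecords files were written correctly, a quick record count can be run (an illustrative sketch, not part of the original code; it assumes the paths used above):

# coding: utf-8
# sanity check: count the records in the generated training file
import tensorflow as tf

count = 0
for record in tf.python_io.tf_record_iterator('./data/mnist_train.tfrecords'):
    count += 1
print 'records:', count    # expect 60000 for the training file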

2. Convolutional Neural Network Practice

2.1 A LeNet-5 implementation on the MNIST dataset

  • We begin with LeNet-5, the most classic convolutional network, explained as follows:
  • The LeNet network was proposed by Yann LeCun et al. in 1998 and takes full advantage of the spatial correlation within images. Since our dataset is MNIST, the LeNet-5 structure needs a few adjustments. First, the original LeNet-5 architecture:

[Figure: the LeNet-5 network architecture]



- The input to the LeNet network is 32*32*1. It first passes through a 5*5*1 convolution with 6 kernels, valid (no zero) padding and stride 1. By the valid-padding formula, output size = (input size - kernel size + 1) / stride = (32-5+1)/1 = 28, so the convolution outputs 28*28*6. The first pooling layer uses a 2*2 window, zero padding and stride 2; by the zero-padding formula, output size = input size / stride = 28/2 = 14, and pooling does not change the depth, which stays 6. Applying the same computation to the second convolution and pooling layers gives a 5*5*16 output, which is flattened and fed into the fully connected layers.
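
These size formulas can be checked with a few lines of arithmetic (an illustrative sketch; valid_out and same_out are hypothetical helpers, not part of the original code):

# worked size computation for LeNet-5
def valid_out(n, f, s): return (n - f) / s + 1    # valid (no zero) padding
def same_out(n, s):     return (n + s - 1) / s    # zero padding, ceil(n/s)

n = valid_out(32, 5, 1)    # conv1: (32-5+1)/1 = 28  -> 28*28*6
n = same_out(n, 2)         # pool1: 28/2 = 14        -> 14*14*6
n = valid_out(n, 5, 1)     # conv2: (14-5+1)/1 = 10  -> 10*10*16
n = same_out(n, 2)         # pool2: 10/2 = 5         -> 5*5*16
print n                    # 5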


- From this structure, the LeNet network has the following characteristics:
① convolution (Conv), pooling (average pooling) and a nonlinear activation function (sigmoid) alternate; ② layers are sparsely connected, which reduces computational complexity.
- Next, we adjust the network to fit the MNIST dataset:

[Figure: the LeNet-5 network adjusted for MNIST]

  • The implementation of the LeNet network on MNIST splits into three main parts: forward propagation (forward.py), back propagation (backward.py) and testing (test.py); an application program (app.py) is included as well. The code is as follows:
# forward propagation, forward.py 
# coding: utf-8 
import tensorflow as tf 

# parameters 
IMAGE_SIZE = 28 
NUM_CHANNELS = 1 
CONV1_SIZE = 5 
CONV1_KERNEL_NUM = 32 
CONV2_SIZE = 5 
CONV2_KERNEL_NUM = 64
FC_SIZE = 512
OUTPUT_NODE = 10 

def get_weight(shape,regularizer):
    w = tf.Variable(tf.truncated_normal(shape,stddev=0.1))
    if regularizer is not None: tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w)) 
    return w 

def get_bias(shape):
    b = tf.Variable(tf.zeros(shape))
    return b  

def conv2d(x,w):
    return tf.nn.conv2d(x,w,strides=[1,1,1,1],padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

def forward(x,train,regularizer):
    conv1_w = get_weight([CONV1_SIZE,CONV1_SIZE,NUM_CHANNELS,CONV1_KERNEL_NUM],regularizer) 
    conv1_b = get_bias([CONV1_KERNEL_NUM])
    conv1 = conv2d(x,conv1_w)   
    relu1 = tf.nn.relu(tf.nn.bias_add(conv1,conv1_b))
    pool1 = max_pool_2x2(relu1)

    conv2_w = get_weight([CONV2_SIZE,CONV2_SIZE,CONV1_KERNEL_NUM,CONV2_KERNEL_NUM],regularizer)
    conv2_b = get_bias([CONV2_KERNEL_NUM])
    conv2 = conv2d(pool1,conv2_w) 
    relu2 = tf.nn.relu(tf.nn.bias_add(conv2,conv2_b)) 
    pool2 = max_pool_2x2(relu2) 

    pool_shape = pool2.get_shape().as_list()
    # pool_shape[0] is the batch size; the next three are the feature map's height, width and depth
    nodes = pool_shape[1] * pool_shape[2] * pool_shape[3] 
    reshaped = tf.reshape(pool2,[pool_shape[0],nodes])

    # fully connected layers
    fc1_w = get_weight([nodes,FC_SIZE],regularizer)
    fc1_b = get_bias([FC_SIZE])
    fc1 = tf.nn.relu(tf.matmul(reshaped,fc1_w) + fc1_b)
    if train: fc1 = tf.nn.dropout(fc1,0.5) 

    fc2_w = get_weight([FC_SIZE,OUTPUT_NODE],regularizer)
    fc2_b = get_bias([OUTPUT_NODE])
    y = tf.matmul(fc1,fc2_w) + fc2_b
    return y 
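
As a quick sanity check of the tensor shapes in this forward pass (an illustrative sketch, not part of the original code): conv1 uses SAME padding with stride 1, so it keeps 28*28 and outputs 32 channels; pool1 halves that to 14*14; conv2 keeps 14*14 with 64 channels; pool2 halves it to 7*7, so the flattened vector has 7*7*64 = 3136 nodes.

# shape check (assumes the forward.py above is importable)
import tensorflow as tf
import forward

x = tf.placeholder(tf.float32, [1, forward.IMAGE_SIZE, forward.IMAGE_SIZE, forward.NUM_CHANNELS])
y = forward.forward(x, False, None)
print y.get_shape()    # expect (1, 10)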

# back propagation, backward.py 
# coding: utf-8 
import os 
import numpy as np 
import forward 
import tensorflow as tf 
from tensorflow.examples.tutorials.mnist import input_data 

# parameters 
BATCH_SIZE = 100 
STEPS = 50000 
LEARNING_RATE_BASE = 0.005 
LEARNING_RATE_DECAY = 0.99
REGULARIZER = 0.0001
MOVING_AVERAGE_DECAY = 0.99 
MODEL_SAVE_PATH = './model/'
MODEL_NAME = 'mnist_model'

def backward(mnist):
    x = tf.placeholder(tf.float32,[BATCH_SIZE,forward.IMAGE_SIZE,forward.IMAGE_SIZE,forward.NUM_CHANNELS])
    y_ = tf.placeholder(tf.float32,[None,forward.OUTPUT_NODE])
    y = forward.forward(x,True,REGULARIZER) 
    global_step = tf.Variable(0,trainable=False)

    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,labels=tf.argmax(y_,1))
    cem = tf.reduce_mean(ce) 
    loss = cem + tf.add_n(tf.get_collection('losses'))

    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE,global_step,mnist.train.num_examples / BATCH_SIZE,LEARNING_RATE_DECAY,staircase=True) 

    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step)

    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step)
    ema_op = ema.apply(tf.trainable_variables())

    with tf.control_dependencies([train_step,ema_op]):
        train_op = tf.no_op(name='train')

    saver = tf.train.Saver()

    with tf.Session() as sess: 
        init_op = tf.global_variables_initializer()
        sess.run(init_op) 

        ckpt = tf.train.get_checkpoint_state(MODEL_SAVE_PATH) 
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess,ckpt.model_checkpoint_path)

        for i in range(STEPS):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            reshaped_xs = np.reshape(xs,(BATCH_SIZE,forward.IMAGE_SIZE,forward.IMAGE_SIZE,forward.NUM_CHANNELS))
            _, loss_value, step = sess.run([train_op, loss,global_step],feed_dict={x: reshaped_xs,y_: ys})
            if i % 100 == 0 :
                print 'After %d training steps,loss on training batch is %g'%(i,loss_value)
                saver.save(sess,os.path.join(MODEL_SAVE_PATH,MODEL_NAME),global_step=global_step)

def main():
    mnist = input_data.read_data_sets('./data/',one_hot=True) 
    backward(mnist)

if __name__ == '__main__':
    main()

# test program, test.py 
# coding: utf-8 
import numpy as np 
import time 
import forward
import backward
import tensorflow as tf 
from tensorflow.examples.tutorials.mnist import input_data

TEST_INTERVAL_SEC = 5

def test(mnist):
    with tf.Graph().as_default() as g:
        x = tf.placeholder(tf.float32,[mnist.test.num_examples,forward.IMAGE_SIZE,forward.IMAGE_SIZE,forward.NUM_CHANNELS])
        y_ = tf.placeholder(tf.float32,[None,forward.OUTPUT_NODE])
        y = forward.forward(x,False,None) 

        ema = tf.train.ExponentialMovingAverage(backward.MOVING_AVERAGE_DECAY)  
        ema_restore = ema.variables_to_restore()
        saver = tf.train.Saver(ema_restore)

        correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(y_,1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

        while True:
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(backward.MODEL_SAVE_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess,ckpt.model_checkpoint_path)
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] 
                    reshaped_x = np.reshape(mnist.test.images,(mnist.test.num_examples,forward.IMAGE_SIZE,forward.IMAGE_SIZE,forward.NUM_CHANNELS))
                    accuracy_score = sess.run(accuracy,feed_dict={x:reshaped_x,y_: mnist.test.labels})
                    print 'After %s steps,test accuracy is %g'%(global_step,accuracy_score)

                else:
                    print "No checkpoint file found"
                    return 
            time.sleep(TEST_INTERVAL_SEC)

def main():
    mnist = input_data.read_data_sets('./data/',one_hot=True)
    test(mnist)

if __name__ == '__main__':
    main() 

# application program, app.py 
# coding: utf-8 
# application: read an image, output which digit it shows
import numpy as np 
import tensorflow as tf 
from PIL import Image 
import forward 
import backward 

def restore_model(testPicArr):
    with tf.Graph().as_default() as tg:
        x = tf.placeholder(tf.float32,[1,forward.IMAGE_SIZE,forward.IMAGE_SIZE,forward.NUM_CHANNELS])   # the conv net expects a 4-D input
        y = forward.forward(x,False,None)    # train=False: no dropout at inference time
        preValue = tf.argmax(y,1)

        variable_averages = tf.train.ExponentialMovingAverage(backward.MOVING_AVERAGE_DECAY)
        variable_restore = variable_averages.variables_to_restore() 
        saver = tf.train.Saver(variable_restore)

        with tf.Session() as sess: 
            ckpt = tf.train.get_checkpoint_state(backward.MODEL_SAVE_PATH) 
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess,ckpt.model_checkpoint_path)

                preValue = sess.run(preValue,feed_dict={x: testPicArr})
                return preValue 
            else:
                print 'No checkpoint file found'
                return -1 


def pre_pic(testName):
    img = Image.open(testName)
    reIm = img.resize((28,28),Image.ANTIALIAS) 
    im_arr = np.array(reIm.convert('L'))
    threshold = 50 

    # invert (MNIST digits are white on black) and binarize to suppress noise
    for i in range(28):
        for j in range(28):
            im_arr[i][j] = 255 - im_arr[i][j] 
            if im_arr[i][j] < threshold:
                im_arr[i][j] = 0 
            else: 
                im_arr[i][j] = 255  
    nm_arr = im_arr.reshape([1,28,28,1])   # 4-D shape expected by the conv net
    nm_arr = nm_arr.astype(np.float) 
    img_ready = np.multiply(nm_arr,1.0/255.0)

    return img_ready


def application():
    testNum = input('input the number of test pictures')
    for i in range(testNum):
        testPic = raw_input('the path of test picture:')
        testPicArr = pre_pic(testPic) 
        preValue = restore_model(testPicArr)
        print 'the prediction number is ', preValue

def main():
    application()

if __name__ == '__main__':
    main()

2.2 Image classification on the CIFAR-10 dataset with Keras

  • First, a brief introduction to the CIFAR-10 dataset. Cifar-10 was collected by two of Hinton's students, Alex Krizhevsky and Ilya Sutskever, for recognizing common everyday objects. It consists of 60,000 32*32 RGB color images in 10 classes, with 50,000 for training and 10,000 for testing (cross-validation). Its distinguishing feature is that it moves recognition to everyday objects and to many classes (the sister dataset Cifar-100 reaches 100 classes, and the ILSVRC competition uses 1,000).
  • The Cifar-10 dataset (the binary version is used below) can be downloaded from the official CIFAR-10 page.
  • Now, on to the code:
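
The code below reads the binary (.bin) version of the dataset directly. Each record in a CIFAR-10 .bin file is 3073 bytes: byte 0 is the label (0-9) and the remaining 3072 bytes are the 32*32 image stored as three 1024-byte planes (R, then G, then B). That is why the code reshapes to [10000,3073], splits off column 0 as the labels and transposes the rest to channels-last. A minimal standalone sketch of decoding one record (illustrative, assuming the standard data_batch_1.bin file name):

import numpy as np

record = np.fromfile('data/cifar-10-batches-bin/data_batch_1.bin', dtype=np.uint8, count=3073)
label = record[0]                                          # class index, 0-9
image = record[1:].reshape(3, 32, 32).transpose(1, 2, 0)   # to 32*32*3, channels last
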
%matplotlib inline
import glob           # glob is a file-name pattern-matching module from the Python standard library
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils.np_utils import to_categorical

datadir='data/cifar-10-batches-bin/'

plt.ion()
G = glob.glob(datadir + '*.bin')
A = np.fromfile(G[0],dtype=np.uint8).reshape([10000,3073])    # one batch file: 10000 records of 3073 bytes
labels = to_categorical(A[:,0])                               # one-hot labels
images = A[:,1:].reshape([10000,3,32,32]).transpose(0,2,3,1)  # to channels-last (HWC)
print images.shape
plt.imshow(images[15])
print labels[15]          # label of the image shown above
model = Sequential()

model.add(Convolution2D(16, 5, 5,
                        border_mode='valid',
                        input_shape=(32,32,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(16, 5, 5, border_mode='valid'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

model.fit(images[:8000], labels[:8000], batch_size=100, nb_epoch=300,
          verbose=1, validation_data=(images[8000:], labels[8000:]))
score = model.evaluate(images[8000:], labels[8000:], verbose=0)
print 'Test score:', score[0]
print 'Test accuracy:', score[1]
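
One design note: the script above feeds raw uint8 pixel values (0-255) straight into the network. Scaling them to [0,1] first usually helps training; a common tweak (an assumption on our part, not in the original post) would be:

images = images.astype('float32') / 255.0    # normalize pixels to [0,1] before model.fit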

Reposted from blog.csdn.net/smilejiasmile/article/details/80793296