TensorFlow Video Course Study Summary

      While learning TensorFlow, after covering the basics and building simple models, I found that the existing script retrain.py can take an Inception V3 architecture model already trained on ImageNet and train a new top layer on it. This can be put to use without much background knowledge. If you want to understand how retrain.py works internally, read the source code or this article: https://blog.csdn.net/daydayup_668819/article/details/68060483

       First, it helps to understand what Inception does; recommended reading: https://blog.csdn.net/u010402786/article/details/52433324 . What follows is explained in three parts.


1. GPU version of TensorFlow

        If you will not be running computations on a GPU, this step can be skipped.

        First install CUDA; the download page is:

https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=81&target_type=exelocal

 

      The installer is OS-specific, and the CUDA version must match your TensorFlow build; a mismatch produces a version error at runtime. I used Windows; installing this driver under Linux seemed quite difficult, so I do not particularly recommend it. After downloading, double-click the installer, then add the bin and lib\x64 folders under the CUDA install directory to the Path environment variable. The default install directory is C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA.

      Next install cuDNN, which accelerates deep-learning computations: https://developer.nvidia.com/rdp/cudnn-download . You must register an NVIDIA developer account before you can download it.

   

      The cuDNN version must match the CUDA version. After downloading, unzip the archive and copy the files in its bin, include, and lib folders into the corresponding folders under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0, then copy C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\extras\CUPTI\libx64\cupti64 into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin.

        Then install the GPU build of TensorFlow: pip uninstall tensorflow, followed by pip install tensorflow-gpu.
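        After installing, a quick sanity check (a minimal sketch, assuming the TensorFlow 1.x API used throughout this document) confirms the GPU is visible:

import tensorflow as tf
from tensorflow.python.client import device_lib

# True when TensorFlow can use a CUDA-enabled GPU
print(tf.test.is_gpu_available())
# list every device TensorFlow sees; a working setup includes a /device:GPU:0 entry
print(device_lib.list_local_devices())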

 

2. Fine-tuning an existing model to classify your own images

        This fine-tuning technique retrains only the last layer of the classification model; the earlier convolution and pooling layers stay frozen, so training is fast, few iterations are needed, and relatively little image data is required. First download the TensorFlow source from https://github.com/tensorflow/tensorflow . Unzipped, I got a tensorflow-master folder; what we need is the retrain.py file under tensorflow-master\tensorflow\examples\image_retraining. It is best to create a new working folder and, inside it, a batch file retrain.bat with the following contents:

python3 D:/Tensorflow/tensorflow-master/tensorflow/examples/image_retraining/retrain.py ^
--bottleneck_dir bottleneck ^
--how_many_training_steps 10000 ^
--model_dir D:/Tensorflow/inception_model/ ^
--output_graph output_graph.pb ^
--output_labels output_labels.txt ^
--image_dir data/train/
pause

       Here python3 may need to be changed to python depending on your setup, and the path after it is the path to retrain.py. The bottleneck folder holds a value computed for each image using the existing model; create this folder yourself beforehand. --how_many_training_steps is the number of training steps (10000 in the script above; the default seems to be 4000, and beyond a certain point more steps no longer improve accuracy). The script automatically produces a .pb graph file and a .txt label file. data/train/ holds the images to train on, laid out as follows:
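       For example (a hypothetical layout; the class names cat and dog are made up for illustration):

data/train/
    cat/
        cat_001.jpg
        cat_002.jpg
        ...
    dog/
        dog_001.jpg
        dog_002.jpg
        ...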


       Each folder is one class, and each class should not have too few images, at least 30 or so. Then simply run retrain.bat; every ten training steps it prints the current accuracy figures (training and validation accuracy). When finished, the program has created the two files mentioned above, a .pb file and a .txt file.


      After that you can call these files directly from code to perform classification. The code is as follows:

import tensorflow as tf
import os
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt  # only used by the commented-out display code below

'''
Load and use the files produced by
fine-tuning the pre-trained model.
'''

save_path = 'C:/Users/lin/Desktop'

lines = tf.gfile.GFile('D:/Tensorflow/retrain/output_labels.txt').readlines()
uid_to_human = {}
#read the label file line by line
for uid,line in enumerate(lines):
    line = line.strip('\n')
    uid_to_human[uid] = line

def id_to_string(node_id):
    if node_id not in uid_to_human:
        return ''
    return uid_to_human[node_id]

#create a graph to hold the retrained model
with tf.gfile.FastGFile('D:/Tensorflow/retrain/output_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    #walk the image directory
    i = 1000 #counter used to name the saved images
    for root,dirs,files in os.walk('D:/Tensorflow/retrain/image'):
        for file in files:
            # load the image
            image_data = tf.gfile.FastGFile(os.path.join(root, file), 'rb').read() #must open with 'rb': the default mode tries to decode the bytes as utf-8, which fails on image data
            predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0':image_data})#images must be jpg
            predictions = np.squeeze(predictions) #flatten the result to a 1-D array

            #print the image path and name
            image_path = os.path.join(root, file)
            print(image_path)
            #sort: take the last five indices, i.e. the five highest probabilities, in descending order
            top_k = predictions.argsort()[-5:][::-1]
            # for node_id in top_k:
            #     #get the class name
            #     human_string = id_to_string(node_id)
            #     #get the confidence for this class
            #     # print(node_id)
            #     score = predictions[node_id]
            #     print('%s (score = %.5f)' % (human_string, score))
            # print()
            # display the image
            img = Image.open(image_path)
            # img = np.array(img)
            # plt.imshow(img, cmap = 'gray')
            # plt.axis('off')
            # plt.show()
            #save each image into a folder named after its top class id
            paths = '%s/%s.jpg' % (top_k[0], i)
            i += 1
            os.makedirs(os.path.join(save_path, str(top_k[0])), exist_ok=True) #create the class folder if it does not exist yet
            img.save(os.path.join(save_path,paths))


3. Training a new model from scratch

       Training from scratch has long iteration cycles and needs a lot of data, possibly hundreds of thousands to millions of images, to get a good result; too few images leads to severe overfitting. First download https://github.com/tensorflow/models , which contains the slim model library; you can put it under D:/Tensorflow/ . To train an image model from scratch, the images must first be converted to the tfrecord format (protobuf underneath), since the training step later reads these files. The script that generates the tfrecord files is as follows:

import tensorflow as tf
import os
import random
import math
import sys

#number of test images
_NUM_TEST = 50
#random seed
_RANDOM_SEED = 0
#number of shards
_NUM_SHARDS = 5
#dataset path
DATASET_DIR = 'D:/Tensorflow/slim/images/'
#name of the label file
LABELS_FILENAME = 'D:/Tensorflow/slim/images/labels.txt'

#protobuf is Google's open-source storage format and is fast; the image data is converted to protobuf first
#build the tfrecord file path + name
def _get_dataset_filename(dataset_dir, split_name, shard_id):
    output_filename = 'image_%s_%05d-of-%05d.tfrecord' % (split_name,shard_id, _NUM_SHARDS)
    return os.path.join(dataset_dir, output_filename)

#check whether the tfrecord files already exist
def _dataset_exists(dataset_dir):
    for split_name in ['train', 'test']:
        for shard_id in range(_NUM_SHARDS):
            #build the tfrecord path + name
            output_filename = _get_dataset_filename(dataset_dir, split_name,shard_id)
            #if any expected file is missing, the dataset does not exist yet
            if not tf.gfile.Exists(output_filename):
                return False
    return True

#collect all image files and their classes
def _get_filename_and_classes(dataset_dir):
    #data directories
    directories = []
    #class names
    class_names = []
    for filename in os.listdir(dataset_dir):
        #join into a full path
        path = os.path.join(dataset_dir,filename)
        #only keep directories
        if os.path.isdir(path):
            #record the data directory
            directories.append(path)
            #record the class name
            class_names.append(filename)

    photo_filenames = []
    print(directories)
    #loop over each class folder
    for directory in directories:
        print(directory)
        for filename in os.listdir(directory):
            path = os.path.join(directory,filename)
            #add the image to the image list
            photo_filenames.append(path)

    return photo_filenames, class_names

def int64_feature(values):
    if not isinstance(values, (tuple, list)):
        values = [values]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

def bytes_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values]))

def image_to_tfexample(image_data,image_format,class_id):
    #Abstract base class for protocol message.
    return tf.train.Example(features=tf.train.Features(feature={
        'image/encoded':bytes_feature(image_data),
        'image/format':bytes_feature(image_format),
        'image/class/label':int64_feature(class_id),
    }))

def write_label_file(labels_to_class_names, dataset_dir,filename=LABELS_FILENAME):
    labels_filename = os.path.join(dataset_dir,filename)
    with tf.gfile.Open(labels_filename, 'w') as f:
        for label in labels_to_class_names:
            class_name = labels_to_class_names[label]
            f.write('%d:%s\n' % (label, class_name))

#convert the data to TFRecord format
def _convert_dataset(split_name, filenames, class_names_to_ids, dataset_dir):
    assert split_name in ['train', 'test']
    #how many images go into each shard
    #splitting into several tfrecord files only pays off for large datasets
    num_per_shard = int(math.ceil(len(filenames) / float(_NUM_SHARDS))) #ceil so the last few images are not dropped
    with tf.Graph().as_default():
        with tf.Session() as sess:
            for shard_id in range(_NUM_SHARDS):
                #build the tfrecord path + name
                output_filename = _get_dataset_filename(dataset_dir,split_name,shard_id)
                with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:  #standard pattern
                    #first index of this shard
                    start_ndx = shard_id * num_per_shard
                    #last index of this shard
                    end_ndx = min((shard_id+1) * num_per_shard, len(filenames))
                    for i in range(start_ndx, end_ndx):
                        try:
                            sys.stdout.write('\r>>Converting image %d/%d shard %d'% (i+1,len(filenames),shard_id))
                            sys.stdout.flush()
                            #read the image
                            image_data = tf.gfile.FastGFile(filenames[i],'rb').read()
                            #the class name is the name of the containing folder
                            class_name = os.path.basename(os.path.dirname(filenames[i]))
                            #map the class name to its id
                            class_id = class_names_to_ids[class_name]
                            #build the tfrecord example
                            example = image_to_tfexample(image_data,b'.jpg',class_id)
                            tfrecord_writer.write(example.SerializeToString())
                        except IOError as e:
                            print('Could not read: ',filenames[i])
                            print("Error: ",e)
                            print("Skip it\n")
    sys.stdout.write('\n')
    sys.stdout.flush()
if __name__ == '__main__':
    #check whether the tfrecord files already exist
    if _dataset_exists(DATASET_DIR):
        print('tfrecord files already exist')
    else:
        #collect the images and their classes
        photo_filenames, class_names = _get_filename_and_classes(DATASET_DIR)
        #map class names to ids, e.g. {'flower': 0, 'house': 1}
        class_names_to_ids = dict(zip(class_names,range(len(class_names))))

        #split the data into a training set and a test set
        random.seed(_RANDOM_SEED)
        random.shuffle(photo_filenames) #shuffle
        training_filenames = photo_filenames[_NUM_TEST:]
        testing_filenames = photo_filenames[:_NUM_TEST] #the first _NUM_TEST images go to the test set

        #convert the data
        _convert_dataset('train',training_filenames,class_names_to_ids,DATASET_DIR)
        _convert_dataset('test',testing_filenames,class_names_to_ids,DATASET_DIR)

        #write the labels file
        labels_to_class_names = dict(zip(range(len(class_names)),class_names))
        write_label_file(labels_to_class_names, DATASET_DIR)
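       To sanity-check the conversion, you can count the records in one of the generated shards (a minimal sketch, assuming the TF 1.x tf.python_io API and the shard naming produced by the script above):

import tensorflow as tf

# iterate over the serialized examples in the first training shard and count them
count = 0
for _ in tf.python_io.tf_record_iterator(
        'D:/Tensorflow/slim/images/image_train_00000-of-00005.tfrecord'):
    count += 1
print('records in shard 0:', count)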

       Next, open dataset_factory.py in slim/datasets and register the new dataset: add 'myimages': myimages, to its datasets_map and add from datasets import myimages alongside the other imports, as sketched below.
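       A sketch of that change (the surrounding imports and entries reflect the stock dataset_factory.py; only the two marked lines are new):

# slim/datasets/dataset_factory.py
from datasets import cifar10
from datasets import flowers
from datasets import imagenet
from datasets import mnist
from datasets import myimages  # new: import our dataset module

datasets_map = {
    'cifar10': cifar10,
    'flowers': flowers,
    'imagenet': imagenet,
    'mnist': mnist,
    'myimages': myimages,  # new: register it under the name train.bat will use
}

       Then create myimages.py in the same slim/datasets folder, adapted from the existing dataset files; the code is as follows: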

# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Provides data for the flowers dataset.

The dataset scripts used to create the dataset can be found at:
tensorflow/models/research/slim/datasets/download_and_convert_flowers.py
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import tensorflow as tf

from datasets import dataset_utils

slim = tf.contrib.slim

_FILE_PATTERN = 'image_%s_*.tfrecord'

SPLITS_TO_SIZES = {'train': 250, 'test': 50} #adjust to your actual split sizes; the split names must match those used when generating the tfrecords above

_NUM_CLASSES = 10 #number of classes, adjust to your dataset

_ITEMS_TO_DESCRIPTIONS = {
    'image': 'A color image of varying size.',
    'label': 'A single integer between 0 and 9',
}


def get_split(split_name, dataset_dir, file_pattern=None, reader=None):
  """Gets a dataset tuple with instructions for reading flowers.

  Args:
    split_name: A train/test split name.
    dataset_dir: The base directory of the dataset sources.
    file_pattern: The file pattern to use when matching the dataset sources.
      It is assumed that the pattern contains a '%s' string so that the split
      name can be inserted.
    reader: The TensorFlow reader type.

  Returns:
    A `Dataset` namedtuple.

  Raises:
    ValueError: if `split_name` is not a valid train/test split.
  """
  if split_name not in SPLITS_TO_SIZES:
    raise ValueError('split name %s was not recognized.' % split_name)

  if not file_pattern:
    file_pattern = _FILE_PATTERN
  file_pattern = os.path.join(dataset_dir, file_pattern % split_name)

  # Allowing None in the signature so that dataset_factory can use the default.
  if reader is None:
    reader = tf.TFRecordReader

  keys_to_features = {
      'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
      'image/format': tf.FixedLenFeature((), tf.string, default_value='png'),
      'image/class/label': tf.FixedLenFeature(
          [], tf.int64, default_value=tf.zeros([], dtype=tf.int64)),
  }

  items_to_handlers = {
      'image': slim.tfexample_decoder.Image(),
      'label': slim.tfexample_decoder.Tensor('image/class/label'),
  }

  decoder = slim.tfexample_decoder.TFExampleDecoder(
      keys_to_features, items_to_handlers)

  labels_to_names = None
  if dataset_utils.has_labels(dataset_dir):
    labels_to_names = dataset_utils.read_label_file(dataset_dir)

  return slim.dataset.Dataset(
      data_sources=file_pattern,
      reader=reader,
      decoder=decoder,
      num_samples=SPLITS_TO_SIZES[split_name],
      items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,
      num_classes=_NUM_CLASSES,
      labels_to_names=labels_to_names)
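       Once the dataset is registered, a quick load test (a minimal sketch, run from the slim folder, assuming the tfrecord files and labels.txt are in the dataset directory) confirms the wiring:

from datasets import dataset_factory

# fetch the train split of the newly registered dataset and print its metadata
dataset = dataset_factory.get_dataset('myimages', 'train', 'D:/Tensorflow/slim/images')
print(dataset.num_samples, dataset.num_classes)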

       Next, create a batch file train.bat in the slim folder, as follows:

python3 D:/Tensorflow/slim/train_image_classifier.py ^
--train_dir=D:/Tensorflow/slim/model ^
--dataset_name=myimages ^
--dataset_split_name=train ^
--dataset_dir=D:/Tensorflow/slim/images ^
--batch_size=10 ^
--max_number_of_steps=1000 ^
--model_name=inception_v3
pause

       The first line runs train_image_classifier.py in the slim folder. train_dir is where the model is saved; create the model folder first. dataset_name is the dataset name registered above. batch_size is the batch size, default 32, to be adjusted to your GPU memory. max_number_of_steps must be given a maximum number of training steps, otherwise training never stops. All of these flags are described in train_image_classifier.py; see the source for details.

       This computation takes a very long time; it is only recommended for those with a large dataset, a real need, and good hardware.










