Step-by-Step Object Detection with TensorFlow (2)

This article is a translation of the Medium post "Step by Step TensorFlow Object Detection API Tutorial — Part 2: Converting Existing Dataset to TFRecord". Original: https://medium.com/@WuStangDan/step-by-step-tensorflow-object-detection-api-tutorial-part-2-converting-dataset-to-tfrecord-47f24be9248d

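In the previous article, "Step-by-Step Object Detection with TensorFlow (1)", we chose a pre-trained model for object detection. In this article, I will show how to convert a dataset into a TFRecord file so that we can use it to retrain the model. This is one of the trickiest parts of the whole process: unless the chosen dataset already comes in a specific format, we have to write some code to process it.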

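As mentioned in the previous article, in this tutorial we will build a traffic light classifier that can recognize the state of a traffic light. The pre-trained model can detect traffic lights in an image, but not their state (green, yellow, red, and so on). I decided to use the Bosch Small Traffic Light Dataset, which seems ideal for this task.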

Dataset Labels

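The TensorFlow Object Detection API requires all labeled training data to be in the TFRecord file format. If our dataset comes with labels stored in individual .xml files, as the PASCAL VOC dataset does, we can use the file named create_pascal_tf_record.py (possibly with minor modifications) to convert the dataset into a TFRecord file.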

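Unfortunately, that is not the case here, so we have to write our own script to create the TFRecord file from the dataset. The labels of the Bosch dataset are all stored in a single .yaml file, a fragment of which is shown below: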

- boxes:
  - {label: Green, occluded: false, x_max: 582.3417892052, x_min: 573.3726437481,
    y_max: 276.6271175345, y_min: 256.3114627642}
  - {label: Green, occluded: false, x_max: 517.6267821724, x_min: 510.0276868266,
    y_max: 273.164089267, y_min: 256.4279864221}
  path: ./rgb/train/2015-10-05-16-02-30_bag/720654.png
- boxes: []
  path: ./rgb/train/2015-10-05-16-02-30_bag/720932.png
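
To make the structure of the label file concrete, the following is a minimal sketch of the two entries above written as the Python objects a YAML parser (for example PyYAML's yaml.safe_load) would be expected to produce: a list of dictionaries, each with a path and a list of boxes. The helper count_labels is a hypothetical name introduced here for illustration; it is not part of the original tutorial.

```python
from collections import Counter

# The two entries from the .yaml fragment, as parsed Python objects.
# This literal is typed by hand so the sketch has no third-party imports.
entries = [
    {
        "boxes": [
            {"label": "Green", "occluded": False,
             "x_max": 582.3417892052, "x_min": 573.3726437481,
             "y_max": 276.6271175345, "y_min": 256.3114627642},
            {"label": "Green", "occluded": False,
             "x_max": 517.6267821724, "x_min": 510.0276868266,
             "y_max": 273.164089267, "y_min": 256.4279864221},
        ],
        "path": "./rgb/train/2015-10-05-16-02-30_bag/720654.png",
    },
    # An image with no traffic lights still has an entry, with no boxes.
    {"boxes": [], "path": "./rgb/train/2015-10-05-16-02-30_bag/720932.png"},
]


def count_labels(entries):
    """Count how often each label string occurs across all boxes."""
    return Counter(box["label"] for entry in entries for box in entry["boxes"])


label_counts = count_labels(entries)
images_without_boxes = sum(1 for entry in entries if not entry["boxes"])
```

Running a count like this over the full label file is one way to see which class names the conversion script will need to handle.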

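A TFRecord combines all the labels (bounding boxes) and images of the entire dataset into a single file. Creating a TFRecord file is somewhat painful, but once it exists it is very convenient to work with.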

Creating a Single TFRecord Entry

TensorFlow provides a sample script in the file using_your_own_dataset.md:

def create_tf_example(label_and_data_info):
  # TODO START: Populate the following variables from your example.
  height = None # Image height
  width = None # Image width
  filename = None # Filename of the image. Empty if image is not from file
  encoded_image_data = None # Encoded image bytes
  image_format = None # b'jpeg' or b'png'

  xmins = [] # List of normalized left x coordinates in bounding box (1 per box)
  xmaxs = [] # List of normalized right x coordinates in bounding box
             # (1 per box)
  ymins = [] # List of normalized top y coordinates in bounding box (1 per box)
  ymaxs = [] # List of normalized bottom y coordinates in bounding box
             # (1 per box)
  classes_text = [] # List of string class name of bounding box (1 per box)
  classes = [] # List of integer class id of bounding box (1 per box)
  # TODO END
  tf_label_and_data = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_label_and_data

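The function above receives the label and data information for a single image, as extracted from the .yaml file. Using this information, you need to write the code that populates all of the listed variables. Note that in addition to the bounding box and class information, you must also supply the encoded image data, which can be read with the tensorflow.gfile.GFile() function. Once all of these variables are populated, you can move on to the second part of the script.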
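
As a rough illustration of the "populate the variables" step, here is a minimal sketch that turns one parsed label entry into the per-box lists. It assumes the entry is a dictionary shaped like the .yaml fragment, and that the image width and height are already known and passed in. The function name extract_box_fields, the contents of LABEL_TO_ID, and the 1280 by 720 size in the usage lines are illustrative assumptions, not part of the sample script.

```python
# Hypothetical label map: the class names and ids are assumptions chosen for
# illustration. Ids start at 1 because the Object Detection API reserves
# id 0 for the background class.
LABEL_TO_ID = {"Green": 1, "Red": 2, "Yellow": 3, "off": 4}


def extract_box_fields(entry, width, height, label_to_id=LABEL_TO_ID):
    """Build the per-box lists for one image from a parsed label entry."""
    xmins, xmaxs, ymins, ymaxs = [], [], [], []
    classes_text, classes = [], []
    for box in entry["boxes"]:
        # The label file stores pixel coordinates; the sample script asks
        # for normalized coordinates, so divide by the image size.
        xmins.append(box["x_min"] / width)
        xmaxs.append(box["x_max"] / width)
        ymins.append(box["y_min"] / height)
        ymaxs.append(box["y_max"] / height)
        # Class names are stored as bytes, class ids as integers.
        classes_text.append(box["label"].encode("utf8"))
        classes.append(label_to_id[box["label"]])
    return {
        "xmins": xmins, "xmaxs": xmaxs,
        "ymins": ymins, "ymaxs": ymaxs,
        "classes_text": classes_text, "classes": classes,
    }


# Usage with a made-up entry and an assumed image size of 1280 x 720.
example_entry = {
    "path": "./rgb/train/example.png",  # hypothetical path
    "boxes": [{"label": "Green", "occluded": False,
               "x_min": 320.0, "x_max": 640.0,
               "y_min": 180.0, "y_max": 360.0}],
}
fields = extract_box_fields(example_entry, width=1280, height=720)
```

An entry with an empty boxes list simply yields empty lists, so images without traffic lights can still be written to the record.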

Creating the Entire TFRecord File

Once the create_tf_example function is complete, all that remains is a loop that calls it for every labeled image in the dataset.

import tensorflow as tf
from object_detection.utils import dataset_util

flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS

def create_tf_example(data_and_label_info):
  ...
  ...
  return tf_data_and_label

def main(_):
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  # TODO START: Write code to read in your dataset to examples variable
  file_loc = None
  all_data_and_label_info = LOAD(file_loc)
  # TODO END

  for data_and_label_info in all_data_and_label_info:
    tf_example = create_tf_example(data_and_label_info)
    writer.write(tf_example.SerializeToString())

  writer.close()

if __name__ == '__main__':
  tf.app.run()

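Once the code above is complete, you can run the script. If you would like to see a full example, Anthony Sarkis has a very complete implementation of a TFRecord script for the Bosch dataset.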

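If you have not previously modified your .bashrc file, make sure to run the export PYTHONPATH statement in the terminal window before running this script. Then, from the folder containing the TFRecord script, and with the data (images) placed at the locations listed in the .yaml file (or whichever other file holds the image paths), run the following command.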

python tf_record.py --output_path training.record

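To verify that everything worked, compare the size of the generated training record file with the size of the folder containing all the training images. If they are almost identical, you are done!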
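
The size comparison can also be scripted. The following is a minimal sketch using only the Python standard library; the function names and the 5% tolerance are assumptions made for illustration, since the article only says the two sizes should be almost identical.

```python
import os


def directory_size(root):
    """Return the total size in bytes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def sizes_roughly_match(record_path, image_dir, tolerance=0.05):
    """Check that the record file size is within a relative tolerance
    of the combined size of the images it was built from."""
    record_size = os.path.getsize(record_path)
    images_size = directory_size(image_dir)
    if images_size == 0:
        # Avoid dividing by zero when the image folder is empty.
        return record_size == 0
    return abs(record_size - images_size) / images_size <= tolerance
```

For example, sizes_roughly_match('training.record', './rgb/train') would compare the record created above against the training image folder named in the label file. A large mismatch usually suggests that some images were skipped or written more than once.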

Your dataset will probably have separate training and evaluation sets; make sure to create a separate TFRecord file for each of them.

In the next article, I will show how to create your own dataset, which lets us further improve the model's performance!