Faster R-CNN and Mask R-CNN in PyTorch 1.0 (translation for personal use)

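As with the previous post, this is a translation of the README.md from GitHub, made to help understand the project. The translation is rough in places; apologies in advance, and it will be removed on request if it infringes.
maskrcnn-benchmark has been deprecated. Please see detectron2, which includes implementations for all models in maskrcnn-benchmark.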

This project aims to provide the necessary building blocks for creating detection and segmentation models using PyTorch 1.0.
[Image: demo/demo_e2e_mask_rcnn_X_101_32x8d_FPN_1x.png, from http://cocodataset.org/#explore?id=345434]

Highlights

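PyTorch 1.0: RPN, Faster R-CNN and Mask R-CNN implementations that match or exceed Detectron accuracies
Very fast: up to 2x faster than Detectron and 30% faster than mmdetection during training. See MODEL_ZOO.md for more details.
Memory efficient: uses roughly 500MB less GPU memory than mmdetection during training
Multi-GPU training and inference
Mixed precision training: trains faster with less GPU memory on NVIDIA tensor cores.
Batched inference: can perform inference using multiple images per batch per GPU
CPU support for inference: runs on the CPU at inference time. See the webcam demo for an example
Provides pre-trained models for almost all reference Mask R-CNN and Faster R-CNN configurations with the 1x schedule.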

Webcam and Jupyter notebook demo

We provide a simple webcam demo that illustrates how you can use maskrcnn_benchmark for inference:

cd demo
# by default, it runs on the GPU
# for best results, use min-image-size 800
python webcam.py --min-image-size 800
# can also run it on the CPU
python webcam.py --min-image-size 300 MODEL.DEVICE cpu
# or change the model that you want to use
python webcam.py --config-file ../configs/caffe2/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu
# in order to see the probability heatmaps, pass --show-mask-heatmaps
python webcam.py --min-image-size 300 --show-mask-heatmaps MODEL.DEVICE cpu
# for the keypoint demo
python webcam.py --config-file ../configs/caffe2/e2e_keypoint_rcnn_R_50_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu

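A notebook with the demo can be found in demo/Mask_R-CNN_demo.ipynb.

Installation

Check INSTALL.md for installation instructions.

Model Zoo and Baselines

Pre-trained models, baselines and comparisons with Detectron and mmdetection can be found in MODEL_ZOO.md.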

Inference in a few lines

We provide a helper class to simplify writing inference pipelines using pre-trained models.
Here is how we would do it. Run this from the demo folder:

from maskrcnn_benchmark.config import cfg
from predictor import COCODemo

config_file = "../configs/caffe2/e2e_mask_rcnn_R_50_FPN_1x_caffe2.yaml"

# update the config options with the config file
cfg.merge_from_file(config_file)
# manual override some options
cfg.merge_from_list(["MODEL.DEVICE", "cpu"])

coco_demo = COCODemo(
    cfg,
    min_image_size=800,
    confidence_threshold=0.7,
)
# load image and then run prediction
image = ...
predictions = coco_demo.run_on_opencv_image(image)

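Perform training on the COCO dataset

For the following examples to work, you need to first install maskrcnn_benchmark.

You will also need to download the COCO dataset.
We recommend symlinking the path to the COCO dataset to datasets/ as shown below.
We use the minival and valminusminival sets from Detectron.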

# symlink the coco dataset
cd ~/github/maskrcnn-benchmark
mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2014 datasets/coco/train2014
ln -s /path_to_coco_dataset/test2014 datasets/coco/test2014
ln -s /path_to_coco_dataset/val2014 datasets/coco/val2014
# or use COCO 2017 version
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/test2017 datasets/coco/test2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017

# for pascal voc dataset:
ln -s /path_to_VOCdevkit_dir datasets/voc

P.S. COCO_2017_train = COCO_2014_train + valminusminival, COCO_2017_val = minival

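You can also configure your own paths to the datasets.
To do so, all you need to do is modify maskrcnn_benchmark/config/paths_catalog.py to point to the location where your dataset is stored.
You can also create a new paths_catalog.py file which implements the same two classes, and pass it as the config argument PATHS_CATALOG during training.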

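Single GPU training

Most of the configuration files that we provide assume that we are running on 8 GPUs.
In order to be able to run them on fewer GPUs, there are a few possibilities:
1. Run the following without modifications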

python /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "/path/to/config/file.yaml"

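This should work out of the box and is very similar to what we should do for multi-GPU training.
But the drawback is that it will use much more GPU memory. The reason is that the global batch size set in the configuration files is divided over the number of GPUs. So if we only have a single GPU, the batch size for that GPU will be 8x larger, which might lead to out-of-memory errors.

If you have a lot of memory available, this is the easiest solution.

2. Modify the cfg parameters

If you experience out-of-memory errors, you can reduce the global batch size.
But this means that you will also need to change the learning rate, the number of iterations and the learning rate schedule. Here is an example for Mask R-CNN R-50 FPN with the 1x schedule: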

python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1 MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000

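This follows the scheduling rules from Detectron.
Note that we have multiplied the number of iterations by 8x (as well as the learning rate schedule steps), and we have divided the learning rate by 8x.

We also changed the batch size during testing, but that is generally not necessary because testing requires much less memory than training.

Furthermore, we set MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000 because, during training, proposals are selected per batch rather than per image by default. The value is calculated as 1000 x images-per-gpu. Here we have 2 images per GPU, so we set the number to 1000 x 2 = 2000. If we have 8 images per GPU, the value should be set to 8000. Note that this does not apply if MODEL.RPN.FPN_POST_NMS_PER_BATCH is set to False during training. See #672 for more details.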
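The arithmetic behind these overrides can be sketched in a few lines of Python. This is only an illustration: scale_schedule is a hypothetical helper that is not part of maskrcnn_benchmark, and the reference values (batch size 16, learning rate 0.02, 90000 iterations, steps at 60000 and 80000) are inferred by undoing the 8x factor in the command above rather than read from the config file.

```python
def scale_schedule(ref_batch, ref_lr, ref_max_iter, ref_steps, new_batch, num_gpus=1):
    """Rescale a training schedule for a smaller global batch size.

    Hypothetical helper for illustration; not part of maskrcnn_benchmark.
    """
    if ref_batch % new_batch != 0:
        raise ValueError("ref_batch must be a multiple of new_batch")
    if new_batch % num_gpus != 0:
        raise ValueError("new_batch must be divisible by num_gpus")

    factor = ref_batch // new_batch         # how much smaller the batch became
    images_per_gpu = new_batch // num_gpus  # the global batch is split across GPUs

    return {
        "SOLVER.IMS_PER_BATCH": new_batch,
        # smaller batch: divide the learning rate by the same factor
        "SOLVER.BASE_LR": ref_lr / factor,
        # smaller batch: multiply the iteration count and the LR steps
        "SOLVER.MAX_ITER": ref_max_iter * factor,
        "SOLVER.STEPS": tuple(step * factor for step in ref_steps),
        # proposals are selected per batch: 1000 x images per GPU
        "MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN": 1000 * images_per_gpu,
    }


# Assumed reference schedule (the example command divided by 8).
options = scale_schedule(16, 0.02, 90000, (60000, 80000), new_batch=2, num_gpus=1)
for key, value in options.items():
    print(key, value)
```

With these inputs the helper reproduces the values used in the command above (batch size 2, 720000 iterations, steps at 480000 and 640000, and 2000 proposals).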

Multi-GPU training

We use internally torch.distributed.launch in order to launch multi-GPU training.
This utility function from PyTorch spawns as many Python processes as the number of GPUs we want to use, and each Python process will only use a single GPU.

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000

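Note that we should set MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN following the rule described for single-GPU training.

Mixed precision training

We recently added support for Automatic Mixed Precision using APEX. To enable it, just do single-GPU or multi-GPU training and set DTYPE "float16".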

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000 DTYPE "float16"

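If you want more verbose logging, set AMP_VERBOSE to True. See the Mixed Precision Training guide for more details.

Evaluation

You can test your model directly on single or multiple GPUs. Here is an example for Mask R-CNN R-50 FPN with the 1x schedule on 8 GPUs: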

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/test_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" TEST.IMS_PER_BATCH 16

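To calculate the mAP for each class, you only need to modify a few lines in coco_eval.py. See #524 for more details.

Abstractions

For more information on some of the main abstractions in our implementation, see ABSTRACTIONS.md.

Adding your own dataset

This implementation adds support for COCO-style datasets.
But adding support for training on a new dataset can be done as follows: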

import torch

from maskrcnn_benchmark.structures.bounding_box import BoxList

class MyDataset(object):
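    def __init__(self, transforms=None):
        # set up your dataset as you would do normally, adding your own arguments;
        # transforms is stored here because __getitem__ below uses it
        self.transforms = transforms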

    def __getitem__(self, idx):
        # load the image as a PIL Image
        image = ...

        # load the bounding boxes as a list of list of boxes
        # in this case, for illustrative purposes, we use
        # x1, y1, x2, y2 order.
        boxes = [[0, 0, 10, 10], [10, 20, 50, 50]]
        # and labels
        labels = torch.tensor([10, 20])

        # create a BoxList from the boxes
        boxlist = BoxList(boxes, image.size, mode="xyxy")
        # add the labels to the boxlist
        boxlist.add_field("labels", labels)

        if self.transforms:
            image, boxlist = self.transforms(image, boxlist)

        # return the image, the boxlist and the idx in your dataset
        return image, boxlist, idx

    def get_img_info(self, idx):
        # get img_height and img_width. This is used if
        # we want to split the batches according to the aspect ratio
        # of the image, as it can be more efficient than loading the
        # image from disk
        return {"height": img_height, "width": img_width}

That's it. You can also add extra fields to the boxlist, such as segmentation masks
(using structures.segmentation_mask.SegmentationMask), or even your own instance type.

For a full example of how the COCODataset is implemented, check maskrcnn_benchmark/data/datasets/coco.py.
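The get_img_info method exists so that batches can be split by aspect ratio without opening image files. A minimal sketch of that idea is shown here, under stated assumptions: ToyDataset and group_by_aspect_ratio are hypothetical names used only for illustration, and the simple wide/tall split is an assumption rather than the grouping logic the library actually uses.

```python
class ToyDataset(object):
    """Stand-in dataset that only knows image sizes (hypothetical)."""

    def __init__(self, sizes):
        self.sizes = sizes  # list of (width, height) tuples

    def __len__(self):
        return len(self.sizes)

    def get_img_info(self, idx):
        # same return format as get_img_info in the example above
        width, height = self.sizes[idx]
        return {"height": height, "width": width}


def group_by_aspect_ratio(dataset):
    """Split dataset indices into wide and tall images using metadata only."""
    groups = {"wide": [], "tall": []}
    for idx in range(len(dataset)):
        info = dataset.get_img_info(idx)
        ratio = float(info["width"]) / float(info["height"])
        groups["wide" if ratio >= 1.0 else "tall"].append(idx)
    return groups


dataset = ToyDataset([(640, 480), (480, 640), (800, 800), (333, 500)])
groups = group_by_aspect_ratio(dataset)
print(groups)
```

Because only widths and heights are read, the grouping cost does not depend on image size or on disk speed.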

Once you have created your dataset, it needs to be added in a couple of places:

maskrcnn_benchmark/data/datasets/__init__.py: add it to __all__
maskrcnn_benchmark/config/paths_catalog.py: DatasetCatalog.DATASETS and corresponding if clause in DatasetCatalog.get()
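As a rough illustration of the second item, a catalog entry plus its matching if clause might look like the sketch below. Only the names DatasetCatalog, DATASETS and get() come from the text above; the dataset name, the directory layout, the dictionary keys and the shape of the returned value are assumptions, so check the real paths_catalog.py before copying anything.

```python
import os


class DatasetCatalog(object):
    # Assumed root directory, matching the datasets/ symlinks used earlier.
    DATA_DIR = "datasets"

    # Hypothetical entry; key names and paths are illustrative only.
    DATASETS = {
        "my_dataset_train": {
            "img_dir": "my_dataset/train",
            "ann_file": "my_dataset/annotations/train.json",
        },
    }

    @staticmethod
    def get(name):
        # The "corresponding if clause": map a dataset name to the class
        # that should load it and the arguments that class needs.
        if "my_dataset" in name:
            attrs = DatasetCatalog.DATASETS[name]
            args = dict(
                root=os.path.join(DatasetCatalog.DATA_DIR, attrs["img_dir"]),
                ann_file=os.path.join(DatasetCatalog.DATA_DIR, attrs["ann_file"]),
            )
            return dict(factory="MyDataset", args=args)
        raise RuntimeError("Dataset not available: {}".format(name))


entry = DatasetCatalog.get("my_dataset_train")
print(entry["factory"], entry["args"]["root"])
```

Keeping paths in one catalog means a training config can refer to a dataset by name only, which is what makes the PATHS_CATALOG override described earlier possible.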

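Testing

While the aforementioned example should work for training, we leverage the cocoApi for computing the accuracies during testing. Thus, test datasets should currently follow the cocoApi.

To enable your dataset for testing, add a corresponding if statement in maskrcnn_benchmark/data/datasets/evaluation/__init__.py: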

if isinstance(dataset, datasets.MyDataset):
    return coco_evaluation(**args)

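Finetuning from Detectron weights on custom datasets

Create a script tools/trim_detectron_model.py, as shown here.

You can decide which keys to remove and which to keep by modifying the script.

Then you can simply point to the converted model path in the config file by changing MODEL.WEIGHT.

For further information, please refer to #15.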
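The idea of removing some keys can be sketched with a plain dictionary. This is not the script referred to above: trim_state_dict is a hypothetical helper, and the key names are made up for illustration. The usual motivation, stated here as an assumption, is that prediction layers sized for the original classes do not fit a custom dataset with a different number of classes.

```python
def trim_state_dict(state_dict, drop_substrings):
    """Return a copy of state_dict without keys containing any given substring.

    Hypothetical helper for illustration; not part of maskrcnn_benchmark.
    """
    return {
        key: value
        for key, value in state_dict.items()
        if not any(substring in key for substring in drop_substrings)
    }


# Toy weights: key names are illustrative, values stand in for tensors.
weights = {
    "backbone.body.stem.conv1.weight": [0.1],
    "roi_heads.box.predictor.cls_score.weight": [0.2],
    "roi_heads.box.predictor.bbox_pred.weight": [0.3],
}

# Drop the class-dependent prediction layers and keep the backbone.
trimmed = trim_state_dict(weights, ["cls_score", "bbox_pred"])
print(sorted(trimmed))
```

The removed layers are then initialized from scratch when finetuning, while the kept weights provide the pre-trained starting point.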

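Troubleshooting

If you have issues running or compiling this code, we have compiled a list of common issues in TROUBLESHOOTING.md.

If your issue is not present there, please feel free to open a new issue.

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the url LaTeX package.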

@misc{massa2018mrcnn,
author = {Massa, Francisco and Girshick, Ross},
title = {{maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch}},
year = {2018},
howpublished = {\url{https://github.com/facebookresearch/maskrcnn-benchmark}},
note = {Accessed: [Insert date here]}
}

Projects using maskrcnn-benchmark

RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free.
Cheng-Yang Fu, Mykhailo Shvets, and Alexander C. Berg.
Tech report, arXiv,1901.03353.
FCOS: Fully Convolutional One-Stage Object Detection.
Zhi Tian, Chunhua Shen, Hao Chen and Tong He.
Tech report, arXiv,1904.01355. [code]
MULAN: Multitask Universal Lesion Analysis Network for Joint Lesion Detection, Tagging, and Segmentation.
Ke Yan, Youbao Tang, Yifan Peng, Veit Sandfort, Mohammadhadi Bagheri, Zhiyong Lu, and Ronald M. Summers.
MICCAI 2019. [code]
Is Sampling Heuristics Necessary in Training Deep Object Detectors?
Joya Chen, Dong Liu, Tong Xu, Shilong Zhang, Shiwei Wu, Bin Luo, Xuezheng Peng, Enhong Chen.
Tech report, arXiv,1909.04868. [code]

License

maskrcnn-benchmark is released under the MIT license. See LICENSE for additional details.