MS COCO Dataset Download Links

The COCO official site is blocked (in mainland China), so the download links cannot be viewed there directly. I copied them over after getting past the block; you can download straight from the links.

The page formatting was lost in copying, so it does not match the official site exactly; bear with it.

Images

2014 Train images [83K/13GB]
2014 Val images [41K/6GB]
2014 Test images [41K/6GB]
2015 Test images [81K/12GB]
2017 Train images [118K/18GB]
2017 Val images [5K/1GB]
2017 Test images [41K/6GB]
2017 Unlabeled images [123K/19GB]

Annotations

2014 Train/Val annotations [241MB]
2014 Testing Image info [1MB]
2015 Testing Image info [2MB]
2017 Train/Val annotations [241MB]
2017 Stuff Train/Val annotations [1.1GB]
2017 Panoptic Train/Val annotations [821MB]
2017 Testing Image info [1MB]
2017 Unlabeled Image info [4MB]
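
The hyperlinks behind the entries above were stripped when the page was copied. At the time of writing, the archives follow the official URL pattern http://images.cocodataset.org/zips/<name>.zip for images and http://images.cocodataset.org/annotations/<name>.zip for annotations; verify against the COCO site before relying on this. A minimal download sketch using only the Python standard library:

```python
# A minimal sketch using only the standard library. File names follow the
# official pattern; swap in train2017.zip, test2017.zip, etc. as needed.
import urllib.request

DOWNLOADS = [
    # (server folder, archive name)
    ("zips", "val2017.zip"),                          # 2017 Val images [5K/1GB]
    ("annotations", "annotations_trainval2017.zip"),  # 2017 Train/Val annotations
]

for folder, name in DOWNLOADS:
    url = f"http://images.cocodataset.org/{folder}/{name}"
    print("downloading", url)
    urllib.request.urlretrieve(url, name)  # saved to the current directory
```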

1. Overview

Which dataset splits should you download? Each year's images are associated with different tasks. Specifically:

2014 Train/Val

Detection 2015, Captioning 2015, Detection 2016, Keypoints 2016

2014 Testing

Captioning 2015

2015 Testing

Detection 2015, Detection 2016, Keypoints 2016

2017 Train/Val/Test

Detection 2017, Keypoints 2017, Stuff 2017,
Detection 2018, Keypoints 2018, Stuff 2018, Panoptic 2018,
Detection 2019, Keypoints 2019, Stuff 2019, Panoptic 2019

2017 Unlabeled

[optional data for any competition]

If you are submitting to a 2017, 2018, or 2019 task, you only need to download the 2017 images. You can disregard earlier splits. Note: the split year refers to the year the image splits were released, not the year in which the annotations were released.

To download the images efficiently, we recommend using gsutil rsync, which avoids downloading large zip files. Please follow the instructions in the COCO API Readme to set up the downloaded COCO data (the images and annotations should go in coco/images/ and coco/annotations/). By downloading this dataset, you agree to our Terms of Use.
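
If you go the gsutil route, a sketch along these lines may work. It assumes gsutil is installed and that the public bucket gs://images.cocodataset.org exposes per-split folders (train2017, val2017, ...); check the COCO API Readme for the authoritative commands.

```python
# A sketch of the gsutil-based download, driven from Python. The bucket
# layout below is an assumption; see the COCO API Readme for specifics.
import os
import subprocess

for split in ["train2017", "val2017"]:
    dest = os.path.join("coco", "images", split)
    os.makedirs(dest, exist_ok=True)
    # -m parallelizes; rsync resumes and skips files already downloaded.
    subprocess.run(
        ["gsutil", "-m", "rsync", f"gs://images.cocodataset.org/{split}", dest],
        check=True,
    )
```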

2019 Update: All data for all challenges stays unchanged.

2018 Update: Detection and keypoint data is unchanged. New in 2018, complete stuff and panoptic annotations for all 2017 images are available. Note: if you downloaded the stuff annotations prior to 06/17/2018, please re-download.

2017 Update: The main change in 2017 is that instead of an 83K/41K train/val split, based on community feedback the split is now 118K/5K for train/val. The same exact images are used, and no new annotations for detection/keypoints are provided. However, new in 2017 are stuff annotations on 40K train images (subset of the full 118K train images from 2017) and 5K val images. Also, for testing, in 2017 the test set only has two splits (dev / challenge), instead of the four splits (dev / standard / reserve / challenge) used in previous years. Finally, new in 2017 we are releasing 120K unlabeled images from COCO that follow the same class distribution as the labeled images; this may be useful for semi-supervised learning on COCO.

2. COCO API

The COCO API assists in loading, parsing, and visualizing annotations in COCO. The API supports multiple annotation formats (please see the data format page). For additional details see: CocoApi.m, coco.py, and CocoApi.lua for Matlab, Python, and Lua code, respectively, and also the Python API demo.

Throughout the API "ann"=annotation, "cat"=category, and "img"=image.

getAnnIds: Get ann ids that satisfy given filter conditions.
getCatIds: Get cat ids that satisfy given filter conditions.
getImgIds: Get img ids that satisfy given filter conditions.
loadAnns: Load anns with the specified ids.
loadCats: Load cats with the specified ids.
loadImgs: Load imgs with the specified ids.
loadRes: Load algorithm results and create API for accessing them.
showAnns: Display the specified annotations.
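
As a quick illustration of the functions listed above, here is a minimal usage sketch of the Python API (pycocotools). The annotation file path is an assumption matching the coco/annotations/ layout mentioned earlier.

```python
# Minimal pycocotools walkthrough: category -> images -> annotations.
from pycocotools.coco import COCO

# Path is an assumption; adjust to wherever you unpacked the annotations.
coco = COCO("coco/annotations/instances_val2017.json")

# getCatIds / loadCats: look up the 'person' category.
cat_ids = coco.getCatIds(catNms=["person"])
print(coco.loadCats(cat_ids))

# getImgIds / loadImgs: images containing at least one person.
img_ids = coco.getImgIds(catIds=cat_ids)
img = coco.loadImgs(img_ids[0])[0]
print(img["file_name"], img["height"], img["width"])

# getAnnIds / loadAnns: all person annotations for that image.
ann_ids = coco.getAnnIds(imgIds=img["id"], catIds=cat_ids, iscrowd=None)
anns = coco.loadAnns(ann_ids)
print(len(anns), "annotations")

# showAnns(anns) draws these onto the current matplotlib axes, and
# loadRes(result_file) wraps detection results in the same interface.
```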

3. MASK API

COCO provides segmentation masks for every object instance. This creates two challenges: storing masks compactly and performing mask computations efficiently. We solve both challenges using a custom Run Length Encoding (RLE) scheme. The size of the RLE representation is proportional to the number of boundary pixels of a mask, and operations such as area, union, or intersection can be computed efficiently directly on the RLE. Specifically, assuming fairly simple shapes, the RLE representation is O(√n), where n is the number of pixels in the object, and common computations are likewise O(√n). Naively computing the same operations on the decoded masks (stored as an array) would be O(n).
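
To see the size difference concretely, here is a small sketch using the Python module pycocotools.mask that encodes a filled square and compares the RLE byte count to the raw pixel count:

```python
# Illustration of RLE compactness: a filled square in a 512x512 mask.
import numpy as np
from pycocotools import mask as maskUtils

# Build a binary mask with a 200x200 filled square.
m = np.zeros((512, 512), dtype=np.uint8)
m[100:300, 150:350] = 1

# encode() expects a Fortran-ordered uint8 array.
rle = maskUtils.encode(np.asfortranarray(m))

print("pixels in mask:", int(m.sum()))              # O(n) raw size
print("RLE bytes:", len(rle["counts"]))             # ~O(sqrt(n)) for simple shapes
print("area from RLE:", maskUtils.area(rle))        # computed without decoding
```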

The MASK API provides an interface for manipulating masks stored in RLE format. The API is defined below; for additional details see MaskApi.m, mask.py, or MaskApi.lua. Finally, we note that a majority of ground truth masks are stored as polygons (which are quite compact); these polygons are converted to RLE when needed.

encode: Encode binary masks using RLE.
decode: Decode binary masks encoded via RLE.
merge: Compute union or intersection of encoded masks.
iou: Compute intersection over union between masks.
area: Compute area of encoded masks.
toBbox: Get bounding boxes surrounding encoded masks.
frBbox: Convert bounding boxes to encoded masks.
frPoly: Convert polygon to encoded mask.
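
A short sketch exercising the listed functions through pycocotools.mask. Note that in the Python API, frBbox and frPoly are both exposed through the single helper frPyObjects; the polygon and box values below are made-up examples.

```python
import numpy as np
from pycocotools import mask as maskUtils

h, w = 240, 320  # image height and width

# frPoly: convert a polygon (flat list of x,y pairs) to RLE.
poly = [[60, 40, 200, 40, 200, 180, 60, 180]]     # a simple rectangle
rle_poly = maskUtils.frPyObjects(poly, h, w)      # returns a list of RLEs

# frBbox: convert [x, y, width, height] boxes to RLE.
boxes = np.array([[50, 30, 120, 100]], dtype=np.float64)
rle_box = maskUtils.frPyObjects(boxes, h, w)

# merge: union (intersect=0) or intersection (intersect=1) of encoded masks.
union = maskUtils.merge(rle_poly + rle_box, intersect=0)

# iou: intersection over union, computed directly on the encodings.
print("IoU:", maskUtils.iou(rle_poly, rle_box, [0]))  # one iscrowd flag per gt

# area / toBbox: computed without decoding to a dense array.
print("union area:", maskUtils.area(union))
print("tight bbox:", maskUtils.toBbox(union))

# decode back to a dense binary mask only when needed.
print("dense shape:", maskUtils.decode(union).shape)
```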

Reposted from blog.csdn.net/zmlovelx/article/details/98885136