ML-datasets -- Object Recognition

[toc]

Open-source datasets: object recognition

CIFAR-10 download links:

http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

http://www.cs.toronto.edu/~kriz/cifar-10-matlab.tar.gz

http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz

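The Python archive contains the files data_batch_1 through data_batch_5, plus test_batch. Each file is a Python object serialized with the cPickle library (for background on pickle, see https://docs.python.org/3/library/pickle.html).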

    import pickle

    def unpickle(file):
        # Load one CIFAR batch file; with encoding='bytes' the dict keys are bytes (e.g. b'data').
        with open(file, 'rb') as fo:
            batch = pickle.load(fo, encoding='bytes')
        return batch
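Once a batch is loaded, the flat pixel rows usually need reshaping into images. The following is a minimal sketch, assuming the layout described on the CIFAR page: `b'data'` is an N x 3072 uint8 array (1024 red, then 1024 green, then 1024 blue values, each plane in row-major order) and `b'labels'` is a list of N integers. The helper name `batch_to_images` and the synthetic `fake_batch` are illustrative, not part of the dataset tooling.

```python
import numpy as np

def batch_to_images(batch):
    """Turn a CIFAR-10 batch dict into (images, labels) arrays.

    Assumption: batch[b'data'] has shape (N, 3072) with the R, G and B
    planes stored one after another, and batch[b'labels'] has N ints.
    """
    data = np.asarray(batch[b'data']).astype(np.uint8)
    # (N, 3072) -> (N, channel, row, col) -> (N, row, col, channel)
    images = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    labels = np.asarray(batch[b'labels'], dtype=np.int64)
    return images, labels

# Synthetic stand-in for a real batch, so the sketch runs without the download.
fake_batch = {
    b'data': (np.arange(2 * 3072).reshape(2, 3072) % 256).astype(np.uint8),
    b'labels': [3, 7],
}
images, labels = batch_to_images(fake_batch)
print(images.shape, labels.tolist())
```

With a real file, the same helper would be applied to the dict returned by `unpickle('data_batch_1')`.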

CIFAR-100:

| Version | Size | md5sum |
| --- | --- | --- |
| CIFAR-100 python version | 161 MB | eb9058c3a382ffc7106e4002c42a8d85 |
| CIFAR-100 Matlab version | 175 MB | 6a4bfa1dcd5c9453dda6bb54194911f4 |
| CIFAR-100 binary version (suitable for C programs) | 161 MB | 03b5dce01913d631647c71ecec9e9cb8 |
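The md5sum values listed for CIFAR-100 can be used to check a download. A minimal sketch using Python's standard `hashlib`; the helper names and the local file name in the usage comment are assumptions, not something the dataset page prescribes.

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def verify_md5(path, expected):
    """Compare a file's MD5 with an expected hex string (case-insensitive)."""
    return md5sum(path) == expected.strip().lower()

# Usage (the file name is an assumption about where the archive was saved):
# verify_md5('cifar-100-python.tar.gz', 'eb9058c3a382ffc7106e4002c42a8d85')
```

Reading in chunks keeps memory use flat, which matters more for the much larger LSUN archives than for CIFAR.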

VOC:

LSUN:

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop


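A dataset used in the overseas PASCAL VOC and ImageNet ILSVRC competitions. Its content spans many themes, including bedrooms, refrigerators, classrooms, kitchens, living rooms, and hotels.

Rating: ★★. Recommended use: image recognition

Introduction and download: http://lsun.cs.princeton.edu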

Abstract

While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry. Large labeled training datasets, expensive and tedious to produce, are required to optimize millions of parameters in deep network models. Lagging behind the growth in model capacity, the available datasets are quickly becoming outdated in terms of size and density. To circumvent this bottleneck, we propose to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop. Starting from a large set of candidate images for each category, we iteratively sample a subset, ask people to label them, classify the others with a trained model, split the set into positives, negatives, and unlabeled based on the classification confidence, and then iterate with the unlabeled set. To assess the effectiveness of this cascading procedure and enable further progress in visual recognition research, we construct a new image dataset, LSUN. It contains around one million labeled images for each of 10 scene categories and 20 object categories. We experiment with training popular convolutional networks and find that they achieve substantial performance gains when trained on this dataset.
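The iterative labeling scheme described in the abstract can be sketched as a loop. This is a toy illustration only: each candidate is a single score, the "people" are an oracle function, and the "trained model" is a pair of thresholds taken from the labeled sample. The paper trains convolutional networks instead, and every name here (`human_in_the_loop`, `ask_human`, `sample_size`) is hypothetical.

```python
import random

def human_in_the_loop(candidates, ask_human, sample_size=20, max_rounds=10, seed=0):
    """Toy cascade: sample, label, classify the rest, keep only confident calls."""
    rng = random.Random(seed)
    positives, negatives = [], []
    unlabeled = list(candidates)
    for _ in range(max_rounds):
        if not unlabeled:
            break
        # 1. Sample a subset and ask people to label it.
        k = min(sample_size, len(unlabeled))
        picked = set(rng.sample(range(len(unlabeled)), k))
        labeled = [(unlabeled[i], ask_human(unlabeled[i])) for i in picked]
        rest = [x for i, x in enumerate(unlabeled) if i not in picked]
        positives += [x for x, y in labeled if y]
        negatives += [x for x, y in labeled if not y]
        # 2. "Train" a toy model: scores at or above the lowest labeled positive
        #    are confident positives, scores at or below the highest labeled
        #    negative are confident negatives.
        hi = min((x for x, y in labeled if y), default=float('inf'))
        lo = max((x for x, y in labeled if not y), default=float('-inf'))
        # 3. Split the remainder by confidence and iterate on what is left.
        next_unlabeled = []
        for x in rest:
            if x >= hi:
                positives.append(x)
            elif x <= lo:
                negatives.append(x)
            else:
                next_unlabeled.append(x)
        unlabeled = next_unlabeled
    return positives, negatives, unlabeled

# Hidden ground truth for the toy: a candidate is positive when its score > 0.5.
scores = [i / 1000 for i in range(1000)]
pos, neg, rest = human_in_the_loop(scores, ask_human=lambda x: x > 0.5)
print(len(pos), len(neg), len(rest))
```

Each round shrinks the low-confidence band, so human effort is spent only on the candidates the model cannot yet decide.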

Paper

Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser and Jianxiong Xiao
LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
arXiv:1506.03365 [cs.CV], 10 Jun 2015

Data

10 scene categories for LSUN Scene Classification Challenge: Downloading Code

20 object categories: Link List. Images for each category are stored in LMDB format and the database is then zipped. After downloading and decompressing the zip files, please refer to the LSUN utility code to visualize and export the images. An MD5 sum for each zip file is also provided so that you can verify your downloads.

airplane.zip 06-Mar-2019 00:14 34G
airplane.zip.md5 19-Dec-2019 20:04 47
bicycle.zip 06-Mar-2019 00:44 129G
bicycle.zip.md5 19-Dec-2019 20:04 46
bird.zip 06-Mar-2019 00:57 65G
bird.zip.md5 19-Dec-2019 20:04 43
boat.zip 06-Mar-2019 01:12 86G
boat.zip.md5 19-Dec-2019 20:04 43
bottle.zip 06-Mar-2019 01:24 64G
bottle.zip.md5 19-Dec-2019 20:04 45
bus.zip 06-Mar-2019 01:29 24G
bus.zip.md5 19-Dec-2019 20:04 42
car.zip 06-Mar-2019 02:05 173G
car.zip.md5 19-Dec-2019 20:04 42
cat.zip 06-Mar-2019 02:12 42G
cat.zip.md5 19-Dec-2019 20:04 42
chair.zip 06-Mar-2019 02:31 116G
chair.zip.md5 19-Dec-2019 20:04 44
cow.zip 06-Mar-2019 02:34 15G
cow.zip.md5 19-Dec-2019 20:04 42
dining_table.zip 06-Mar-2019 02:50 48G
dining_table.zip.md5 19-Dec-2019 20:04 51
dog.zip 06-Mar-2019 03:14 145G
dog.zip.md5 19-Dec-2019 20:04 42
horse.zip 06-Mar-2019 03:25 69G
horse.zip.md5 19-Dec-2019 20:04 44
motorbike.zip 06-Mar-2019 03:32 42G
motorbike.zip.md5 19-Dec-2019 20:04 48
person.zip 06-Mar-2019 04:47 477G
person.zip.md5 19-Dec-2019 20:04 45
potted_plant.zip 06-Mar-2019 04:54 43G
potted_plant.zip.md5 19-Dec-2019 20:04 51
sheep.zip 06-Mar-2019 04:57 18G
sheep.zip.md5 19-Dec-2019 20:04 44
sofa.zip 06-Mar-2019 05:06 56G
sofa.zip.md5 19-Dec-2019 20:04 43
train.zip 06-Mar-2019 05:13 43G
train.zip.md5 19-Dec-2019 20:04 44
tv-monitor.zip 06-Mar-2019 05:21 46G
tv-monitor.zip.md5 19-Dec-2019 20:04 49
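Before downloading, it helps to total the archive sizes in the listing above. A small sketch that parses the `.zip` rows copied from that listing and sums the size column; the regular expression and the names `parse_sizes` and `LISTING` are illustrative, and the `G` suffix is taken at face value as shown on the index page.

```python
import re

# The .zip rows from the listing above (the .md5 rows are omitted).
LISTING = """\
airplane.zip 06-Mar-2019 00:14 34G
bicycle.zip 06-Mar-2019 00:44 129G
bird.zip 06-Mar-2019 00:57 65G
boat.zip 06-Mar-2019 01:12 86G
bottle.zip 06-Mar-2019 01:24 64G
bus.zip 06-Mar-2019 01:29 24G
car.zip 06-Mar-2019 02:05 173G
cat.zip 06-Mar-2019 02:12 42G
chair.zip 06-Mar-2019 02:31 116G
cow.zip 06-Mar-2019 02:34 15G
dining_table.zip 06-Mar-2019 02:50 48G
dog.zip 06-Mar-2019 03:14 145G
horse.zip 06-Mar-2019 03:25 69G
motorbike.zip 06-Mar-2019 03:32 42G
person.zip 06-Mar-2019 04:47 477G
potted_plant.zip 06-Mar-2019 04:54 43G
sheep.zip 06-Mar-2019 04:57 18G
sofa.zip 06-Mar-2019 05:06 56G
train.zip 06-Mar-2019 05:13 43G
tv-monitor.zip 06-Mar-2019 05:21 46G
"""

# name.zip, date, time, size with a trailing "G"
LINE_RE = re.compile(r"^(\S+)\.zip\s+\S+\s+\S+\s+(\d+)G$")

def parse_sizes(listing):
    """Map category name -> archive size in G, skipping lines that do not match."""
    sizes = {}
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes

sizes_gb = parse_sizes(LISTING)
total_gb = sum(sizes_gb.values())
largest = max(sizes_gb, key=sizes_gb.get)
print(f"{len(sizes_gb)} archives, {total_gb}G total, largest: {largest}")
```

The 20 archives add up to 1735G compressed, and `person.zip` alone accounts for 477G, so downloading per category is usually the practical choice.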

LSUN Challenge

At CVPR 2015 and 2016, an image classification challenge was hosted at the LSUN Challenge workshop to evaluate progress in large-scale image understanding. More information can be found on the challenge webpage.

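ImageNet dataset

ImageNet is currently one of the most widely used datasets in deep learning for images. It contains more than 14 million images (the ILSVRC subset spans 1,000 classes) and covers image classification, localization, and detection. The dataset is well documented and maintained by a dedicated team, and it is used so widely in computer vision research papers that it has become almost the "standard" dataset for benchmarking deep learning image algorithms. Many large technology companies, including Baidu, Google, and Microsoft, take part in the ImageNet image recognition competition.

Rating: ★★★. Recommended use: image recognition

Introduction and download: http://www.image-net.org/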

Tiny Images Dataset

The dataset consists of 79,302,017 images, each a 32x32 color image. The data is stored as binary files and totals about 400 GB.
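A quick size check and a record lookup can be sketched as follows. This assumes the images sit back to back in one flat binary file as fixed 3072-byte records (32 x 32 x 3, one byte per channel value); the exact on-disk pixel ordering is not stated here, so `read_record` only returns the raw bytes. The function name and constants are illustrative.

```python
IMAGE_BYTES = 32 * 32 * 3      # one byte per channel value
NUM_IMAGES = 79_302_017        # image count quoted above

def read_record(path, index, record_size=IMAGE_BYTES):
    """Return the raw bytes of record `index` from a flat fixed-size-record file."""
    if index < 0:
        raise IndexError("record index must be non-negative")
    with open(path, 'rb') as f:
        f.seek(index * record_size)
        record = f.read(record_size)
    if len(record) != record_size:
        raise IndexError(f"record {index} is past the end of {path}")
    return record

raw_pixel_bytes = NUM_IMAGES * IMAGE_BYTES
print(f"raw pixels: {raw_pixel_bytes / 1e9:.1f} GB")
```

Raw pixels alone come to 243,615,796,224 bytes (about 244 GB), so the roughly 400 GB figure presumably includes more than the pixel data.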

Rating: ★★. Recommended use: image recognition

Introduction and download: http://horatio.cs.nyu.edu/mit/tiny/data/index.html

CoPhIR

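CoPhIR is a dataset of roughly 106 million images collected from Flickr. Each entry carries the image's own metadata (location, title, GPS, tags, comments, and so on), along with extracted features such as color structure, color layout, edge histogram, and homogeneous texture.

Rating: ★★. Recommended use: image recognition

Introduction and download: http://cophir.isti.cnr.it/whatis.html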

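Labeled Faces in the Wild dataset

A database of face photographs for studying unconstrained face recognition. It contains more than 13,000 images collected from the web. Each face is labeled with the name of the person pictured, and 1,680 of those people have two or more distinct photos in the dataset.

Rating: ★★. Recommended use: face recognition

Introduction and download: http://vis-www.cs.umass.edu/lfw/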

SVHN

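SVHN is drawn from house numbers in Google Street View imagery. It is a real-world image dataset for developing machine learning and object recognition algorithms, with minimal requirements for data preprocessing and formatting. It is similar to MNIST but contains an order of magnitude more labeled data (over 600,000 digit images) from a more varied source, and it is used for recognizing digits in natural scene images.

Rating: ★★. Recommended uses: machine learning, image recognition

Introduction and download: http://ufldl.stanford.edu/housenumbers/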

MS COCO

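COCO (Common Objects in Context) is a new dataset for image recognition, segmentation, and captioning, sponsored by Microsoft. Images are annotated not only with categories and locations but also with semantic text descriptions. Its open release has driven major progress in image segmentation and semantic understanding over the past two to three years, and it has become almost the "standard" dataset for evaluating image semantic understanding algorithms.

Rating: ★★★. Recommended uses: image recognition, image semantic understanding

Introduction and download: http://mscoco.org/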

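Google YouTube-8M

YouTube-8M is a large, diversely labeled video dataset. It currently comprises 7 million YouTube video URLs, 450,000 hours of video, 3.2 billion audio/visual features, and 4,716 classes, with an average of about 3 labels per video.

Rating: ★★★. Recommended uses: video understanding, representation learning, noisy data modeling, transfer learning, and domain adaptation approaches for video

Introduction and download: https://research.google.com/youtube8m/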

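Udacity open-source driving video dataset

The dataset is about 223 GB and consists mainly of driving data. Besides the images captured from the vehicle, it includes the vehicle's own attributes and parameters, such as latitude and longitude, brake, throttle, steering angle, and engine speed. The data can be used to train models for autonomous driving.

Rating: ★★★. Recommended use: autonomous driving

Introduction and download: https://github.com/udacity/self-driving-car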

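Oxford RobotCar video dataset

The RobotCar dataset contains driving data from more than 100 traversals of the same route over a period of more than one year. It captures many different combinations of weather, traffic, pedestrians, construction, and roadworks.

Rating: ★★★. Recommended use: autonomous driving

Introduction and download: http://robotcar-dataset.robots.ox.ac.uk/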

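Udacity open-source short-video dataset of natural scenes

The dataset is about 9 TB and consists of 35 million video clips. Each clip is short (32 frames), roughly 1 second long.

Rating: ★★★. Recommended uses: object tracking, video object recognition

Introduction and download: http://web.mit.edu/vondrick/tinyvideo/#data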

3. Natural language datasets

[todo]

ref:

https://www.zhihu.com/question/63383992

http://blog.itpub.net/29829936/viewspace-2219159/