Computing the mean and standard deviation

    import numpy

    def compute_mean_std(cifar100_dataset):
        """Compute the per-channel mean and std of a CIFAR-100 dataset.

        Args:
            cifar100_dataset: the CIFAR-100 training or test dataset, a
                subclass of torch.utils.data.Dataset. Each item is assumed
                to be (label, image), with image an H x W x 3 array.

        Returns:
            A tuple (mean, std), each holding one value per channel,
            computed over the entire dataset.
        """
        # Stack one channel of every image along a new depth axis.
        # Note: torchvision datasets return (image, label); use index 0 there.
        data_r = numpy.dstack([cifar100_dataset[i][1][:, :, 0] for i in range(len(cifar100_dataset))])
        data_g = numpy.dstack([cifar100_dataset[i][1][:, :, 1] for i in range(len(cifar100_dataset))])
        data_b = numpy.dstack([cifar100_dataset[i][1][:, :, 2] for i in range(len(cifar100_dataset))])
        mean = numpy.mean(data_r), numpy.mean(data_g), numpy.mean(data_b)
        std = numpy.std(data_r), numpy.std(data_g), numpy.std(data_b)
        return mean, std

    # Results, on a [0, 1] scale (divide uint8 pixel values by 255 to match)
    mean = {
        'cifar10': (0.4914, 0.4822, 0.4465),
        'cifar100': (0.5071, 0.4867, 0.4408),
    }
    std = {
        'cifar10': (0.2470, 0.2435, 0.2616),
        'cifar100': (0.2675, 0.2565, 0.2761),
    }

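The function above holds every image in memory at once. As a minimal sketch of an alternative, the same per-channel statistics can be accumulated one image at a time from running sums. The name `channel_mean_std` is hypothetical, and the code assumes each image is an H x W x C `uint8` array with values in 0..255.

```python
import numpy as np

def channel_mean_std(images):
    """Accumulate per-channel mean and std over an iterable of H x W x C arrays."""
    total = None      # per-channel sum of pixel values
    total_sq = None   # per-channel sum of squared pixel values
    count = 0         # number of pixels seen per channel
    for img in images:
        # Assumption: uint8 input in 0..255, scaled here to [0, 1].
        x = np.asarray(img, dtype=np.float64) / 255.0
        flat = x.reshape(-1, x.shape[-1])  # one row per pixel, one column per channel
        if total is None:
            total = np.zeros(flat.shape[1])
            total_sq = np.zeros(flat.shape[1])
        total += flat.sum(axis=0)
        total_sq += (flat ** 2).sum(axis=0)
        count += flat.shape[0]
    mean = total / count
    # Population variance (ddof=0), the same convention as numpy.std;
    # clamp at zero to guard against tiny negative rounding errors.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)
```

On the same data this should agree with `numpy.mean` and `numpy.std` up to floating-point rounding, while only one image is resident at a time.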
Data augmentation
https://blog.csdn.net/see_you_yu/article/details/106722787
MNIST
A training set of 60,000 examples and a test set of 10,000 examples
Image size: 28 x 28 = 784 pixels
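The 784 figure is the length of one image once it is flattened into a vector. A small sketch, using a placeholder array of zeros rather than real MNIST data:

```python
import numpy as np

# Hypothetical stand-in for a batch of 5 grayscale MNIST images.
batch = np.zeros((5, 28, 28), dtype=np.uint8)

# Flatten each 28 x 28 image into a vector of 28 * 28 = 784 values.
flat = batch.reshape(len(batch), -1)
print(flat.shape)  # (5, 784)
```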
GTSRB
43 classes of traffic signs, split into 39,209 training images and 12,630 test images
CIFAR-10
Reference: https://www.cnblogs.com/Jerry-Dong/p/8109938.html
Basic information
A subset of the Tiny Images dataset. The authors of Tiny Images have decided to withdraw it because it contains offensive content, and have asked the community to stop using it.
10 classes with 6,000 images each, 60,000 images in total
50,000 images form the training set and 10,000 the test set
shape: 32x32
Mean and standard deviation
Reference: https://gist.github.com/weiaicunzai/e623931921efefd4c331622c344d8151
Mean: [0.4913997551666284, 0.48215855929893703, 0.4465309133731618]
Standard deviation: [0.24703225141799082, 0.24348516474564, 0.26158783926049628]
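These statistics are typically used to standardize each channel before training. A minimal sketch, assuming an H x W x 3 `uint8` image in RGB order and the values above rounded to four decimals (`normalize` is a hypothetical helper, not part of any library):

```python
import numpy as np

# CIFAR-10 per-channel statistics from above, rounded to four decimals.
CIFAR10_MEAN = np.array([0.4914, 0.4822, 0.4465])
CIFAR10_STD = np.array([0.2470, 0.2435, 0.2616])

def normalize(image_uint8):
    """Scale an H x W x 3 uint8 image to [0, 1], then standardize each channel."""
    x = image_uint8.astype(np.float64) / 255.0
    # Broadcasting applies the 3-element mean and std along the last (channel) axis.
    return (x - CIFAR10_MEAN) / CIFAR10_STD

# Usage with a placeholder all-black image.
black = np.zeros((32, 32, 3), dtype=np.uint8)
out = normalize(black)
```

For an all-black image every pixel becomes `-mean / std` for its channel, which is a quick way to check that the channel axis is the one being broadcast over.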
Classes
0 airplane
1 automobile
2 bird
3 cat
4 deer
5 dog
6 frog
7 horse
8 ship
9 truck
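CIFAR-100
60,000 images of size 32x32 in total
100 classes, grouped into 20 superclasses
600 images per class: 500 for training and 100 for testing
Each image has two labels: a fine label (its class) and a coarse label (its superclass)
Classes
Reference: https://blog.csdn.net/qq_36653505/article/details/87864405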
| Superclass | Classes |
|---|---|
| aquatic mammals | beaver, dolphin, otter, seal, whale |
| fish | aquarium fish, flatfish, ray, shark, trout |
| flowers | orchids, poppies, roses, sunflowers, tulips |
| food containers | bottles, bowls, cans, cups, plates |
| fruit and vegetables | apples, mushrooms, oranges, pears, sweet peppers |
| household electrical devices | clock, computer keyboard, lamp, telephone, television |
| household furniture | bed, chair, couch, table, wardrobe |
| insects | bee, beetle, butterfly, caterpillar, cockroach |
| large carnivores | bear, leopard, lion, tiger, wolf |
| large man-made outdoor things | bridge, castle, house, road, skyscraper |
| large natural outdoor scenes | cloud, forest, mountain, plain, sea |
| large omnivores and herbivores | camel, cattle, chimpanzee, elephant, kangaroo |
| medium-sized mammals | fox, porcupine, possum, raccoon, skunk |
| non-insect invertebrates | crab, lobster, snail, spider, worm |
| people | baby, boy, girl, man, woman |
| reptiles | crocodile, dinosaur, lizard, snake, turtle |
| small mammals | hamster, mouse, rabbit, shrew, squirrel |
| trees | maple, oak, palm, pine, willow |
| vehicles 1 | bicycle, bus, motorcycle, pickup truck, train |
| vehicles 2 | lawn-mower, rocket, streetcar, tank, tractor |
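The two-level labelling in the table can be represented as a lookup from class to superclass. The sketch below copies only three rows of the table for brevity, and the spellings follow the table, which may differ from the label strings stored in the dataset files:

```python
# Excerpt of the superclass table above; the remaining rows follow the same pattern.
SUPERCLASSES = {
    "aquatic mammals": ["beaver", "dolphin", "otter", "seal", "whale"],
    "fish": ["aquarium fish", "flatfish", "ray", "shark", "trout"],
    "trees": ["maple", "oak", "palm", "pine", "willow"],
}

# Invert the mapping: class name -> superclass name.
CLASS_TO_SUPERCLASS = {
    cls: superclass
    for superclass, classes in SUPERCLASSES.items()
    for cls in classes
}

print(CLASS_TO_SUPERCLASS["dolphin"])  # aquatic mammals
```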
ImageNet
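14,197,122 images (about 14 million)
More than 20,000 classes
Images are read at 224 x 224? (The original images vary in size.)
The commonly used subset is ILSVRC 2012
Training set: 1,281,167 images with labels
Validation set: 50,000 images with labels
Test set: 100,000 images (labels are not publicly released)
Classes: https://image-net.org/challenges/LSVRC/2014/browse-synsets
Usage: https://blog.csdn.net/weixin_43002433/article/details/106225771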
TinyImageNet
Number of classes: 200
Total training images: 100,000
Training images per class: 500
Total test images: 10,000
Validation images per class: 50
Test images per class: 50
Image size: 64 x 64
miniImageNet
miniImageNet contains 60,000 color images in 100 classes, with 600 samples per class; each image is 84 × 84.
