How to compute the mean and standard deviation

    import numpy


    def compute_mean_std(cifar100_dataset):
        """Compute the per-channel mean and std of the CIFAR-100 dataset.

        Args:
            cifar100_dataset: cifar100_training_dataset or cifar100_test_dataset,
                which is derived from torch.utils.data.Dataset. As in the
                original indexing, item[1] is taken to be an H x W x 3 image.

        Returns:
            A tuple (mean, std) with the per-channel values of the entire dataset.
        """
        n = len(cifar100_dataset)
        # Stack each color channel of every image along the depth axis.
        data_r = numpy.dstack([cifar100_dataset[i][1][:, :, 0] for i in range(n)])
        data_g = numpy.dstack([cifar100_dataset[i][1][:, :, 1] for i in range(n)])
        data_b = numpy.dstack([cifar100_dataset[i][1][:, :, 2] for i in range(n)])
        mean = numpy.mean(data_r), numpy.mean(data_g), numpy.mean(data_b)
        std = numpy.std(data_r), numpy.std(data_g), numpy.std(data_b)
        return mean, std


    # Results
    mean = {
        'cifar10': (0.4914, 0.4822, 0.4465),
        'cifar100': (0.5071, 0.4867, 0.4408),
    }
    std = {
        'cifar10': (0.2470, 0.2435, 0.2616),
        'cifar100': (0.2675, 0.2565, 0.2761),
    }
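
The same per-channel statistics can also be computed without building three separate stacks. The following is a minimal sketch, assuming the images are already available as one array of shape (N, H, W, 3) with values scaled to [0, 1]. The function name `channel_mean_std` and the random stand-in data are hypothetical and are not part of the original gist.

```python
import numpy as np


def channel_mean_std(images):
    """Return per-channel (mean, std) for an array of shape (N, H, W, C)."""
    # Reduce over the sample, height and width axes; keep the channel axis.
    mean = images.mean(axis=(0, 1, 2))
    std = images.std(axis=(0, 1, 2))
    return tuple(mean), tuple(std)


# Stand-in data: 8 random 32x32 RGB images, NOT real CIFAR pixels.
rng = np.random.default_rng(0)
fake_images = rng.random((8, 32, 32, 3))
mean, std = channel_mean_std(fake_images)
print(mean, std)
```

Reducing over axes (0, 1, 2) pools every pixel of a channel across the whole dataset, which should match what the `dstack`-based version computes for each channel.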

Data augmentation

https://blog.csdn.net/see_you_yu/article/details/106722787

MNIST

A training set of 60,000 examples and a test set of 10,000 examples

Image size: 28 x 28 = 784 pixels

Mean: 0.1307
Standard deviation: 0.3081
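
These two values are typically used to standardize inputs. A minimal sketch follows, assuming pixel values are scaled to [0, 1] before normalization; the helper names `normalize` and `denormalize` are hypothetical.

```python
MNIST_MEAN = 0.1307
MNIST_STD = 0.3081


def normalize(pixel):
    """Map a pixel value in [0, 1] to the standardized scale."""
    return (pixel - MNIST_MEAN) / MNIST_STD


def denormalize(value):
    """Invert normalize(), recovering the [0, 1] pixel value."""
    return value * MNIST_STD + MNIST_MEAN


# A raw 8-bit pixel is first scaled to [0, 1], then standardized.
raw = 255
scaled = raw / 255.0
print(normalize(scaled))
```

The inverse is handy when a normalized tensor needs to be displayed as an image again.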

GTSRB

43 classes of traffic signs, split into 39,209 training images and 12,630 test images

Cifar-10

Reference: https://www.cnblogs.com/Jerry-Dong/p/8109938.html

Basic information

A subset of the Tiny Images dataset. The authors of Tiny Images have decided to withdraw that dataset because it contains offensive content, and have asked the community to stop using it.

10 classes with 6,000 images each, 60,000 images in total
50,000 images form the training set and 10,000 form the test set
Image size: 32 x 32

Mean and standard deviation

Reference: https://gist.github.com/weiaicunzai/e623931921efefd4c331622c344d8151
Mean: [0.4913997551666284, 0.48215855929893703, 0.4465309133731618]
Standard deviation: [0.24703225141799082, 0.24348516474564, 0.26158783926049628]

Classes

0 airplane
1 automobile
2 bird
3 cat
4 deer
5 dog
6 frog
7 horse
8 ship
9 truck
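
The index-to-name mapping above is easy to keep as a list, so that the position of each name is its label. A small sketch; the helper names `label_to_name` and `name_to_label` are hypothetical.

```python
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def label_to_name(label):
    """Return the class name for an integer label in [0, 9]."""
    return CIFAR10_CLASSES[label]


def name_to_label(name):
    """Return the integer label for a class name."""
    return CIFAR10_CLASSES.index(name)


print(label_to_name(3), name_to_label("truck"))
```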


Cifar-100

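60,000 images of size 32 x 32 in total
100 classes, which are grouped into 20 superclasses
600 images per class: 500 for the training set and 100 for the test set
Each image has two labels: a fine label (its class) and a coarse label (its superclass)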

Classes

Reference: https://blog.csdn.net/qq_36653505/article/details/87864405

Superclass: Classes
aquatic mammals: beaver, dolphin, otter, seal, whale
fish: aquarium fish, flatfish, ray, shark, trout
flowers: orchids, poppies, roses, sunflowers, tulips
food containers: bottles, bowls, cans, cups, plates
fruit and vegetables: apples, mushrooms, oranges, pears, sweet peppers
household electrical devices: clock, computer keyboard, lamp, telephone, television
household furniture: bed, chair, couch, table, wardrobe
insects: bee, beetle, butterfly, caterpillar, cockroach
large carnivores: bear, leopard, lion, tiger, wolf
large man-made outdoor things: bridge, castle, house, road, skyscraper
large natural outdoor scenes: cloud, forest, mountain, plain, sea
large omnivores and herbivores: camel, cattle, chimpanzee, elephant, kangaroo
medium-sized mammals: fox, porcupine, possum, raccoon, skunk
non-insect invertebrates: crab, lobster, snail, spider, worm
people: baby, boy, girl, man, woman
reptiles: crocodile, dinosaur, lizard, snake, turtle
small mammals: hamster, mouse, rabbit, shrew, squirrel
trees: maple, oak, palm, pine, willow
vehicles 1: bicycle, bus, motorcycle, pickup truck, train
vehicles 2: lawn-mower, rocket, streetcar, tank, tractor
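
The superclass table can be held as a dictionary and inverted to look up the coarse label of any fine class. This is a sketch using the names exactly as written in the table; the label strings stored in the actual dataset files may be spelled differently, and the variable names here are hypothetical.

```python
SUPERCLASSES = {
    "aquatic mammals": ["beaver", "dolphin", "otter", "seal", "whale"],
    "fish": ["aquarium fish", "flatfish", "ray", "shark", "trout"],
    "flowers": ["orchids", "poppies", "roses", "sunflowers", "tulips"],
    "food containers": ["bottles", "bowls", "cans", "cups", "plates"],
    "fruit and vegetables": ["apples", "mushrooms", "oranges", "pears", "sweet peppers"],
    "household electrical devices": ["clock", "computer keyboard", "lamp", "telephone", "television"],
    "household furniture": ["bed", "chair", "couch", "table", "wardrobe"],
    "insects": ["bee", "beetle", "butterfly", "caterpillar", "cockroach"],
    "large carnivores": ["bear", "leopard", "lion", "tiger", "wolf"],
    "large man-made outdoor things": ["bridge", "castle", "house", "road", "skyscraper"],
    "large natural outdoor scenes": ["cloud", "forest", "mountain", "plain", "sea"],
    "large omnivores and herbivores": ["camel", "cattle", "chimpanzee", "elephant", "kangaroo"],
    "medium-sized mammals": ["fox", "porcupine", "possum", "raccoon", "skunk"],
    "non-insect invertebrates": ["crab", "lobster", "snail", "spider", "worm"],
    "people": ["baby", "boy", "girl", "man", "woman"],
    "reptiles": ["crocodile", "dinosaur", "lizard", "snake", "turtle"],
    "small mammals": ["hamster", "mouse", "rabbit", "shrew", "squirrel"],
    "trees": ["maple", "oak", "palm", "pine", "willow"],
    "vehicles 1": ["bicycle", "bus", "motorcycle", "pickup truck", "train"],
    "vehicles 2": ["lawn-mower", "rocket", "streetcar", "tank", "tractor"],
}

# Invert the table: fine class name -> superclass name.
FINE_TO_COARSE = {
    fine: coarse
    for coarse, fines in SUPERCLASSES.items()
    for fine in fines
}

print(len(SUPERCLASSES), len(FINE_TO_COARSE))
print(FINE_TO_COARSE["dolphin"])
```

Counting the entries is a quick check that the table really has 20 superclasses of 5 classes each.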

ImageNet

14,197,122 images (about 14 million)
More than 20,000 classes
Read at 224 x 224?

The commonly used subset is ILSVRC 2012

Training set: 1,281,167 images with labels
Validation set: 50,000 images with labels
Test set: 100,000 images (the test labels are not publicly released)
Classes: https://image-net.org/challenges/LSVRC/2014/browse-synsets
Usage: https://blog.csdn.net/weixin_43002433/article/details/106225771

TinyImageNet

Number of classes: 200
Total training images: 100,000
Training images per class: 500
Total test images: 10,000
Validation images per class: 50
Test images per class: 50
Image size: 64 x 64
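
The per-class counts and the totals listed for this dataset can be cross-checked with a few lines of arithmetic. A small sketch; the variable names are hypothetical.

```python
NUM_CLASSES = 200
PER_CLASS = {"train": 500, "val": 50, "test": 50}

# Total images per split = number of classes x images per class.
totals = {split: NUM_CLASSES * count for split, count in PER_CLASS.items()}
print(totals)
```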

miniImageNet

miniImageNet contains 60,000 color images in 100 classes, with 600 samples per class. Each image is 84 x 84.