1. Mean and standard deviation
When training a network, you will often see code like this:
self.image_transform = transforms.Compose([
    transforms.Resize((self.size, self.size)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
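Here transforms.Normalize() normalizes the data channel by channel: the first argument, [0.485, 0.456, 0.406], is the per-channel mean, and the second argument, [0.229, 0.224, 0.225], is the per-channel standard deviation (std), not the variance.
This pair of values ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) appears often in the official PyTorch tutorials; it was computed from a sample of the ImageNet dataset.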
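To make the arithmetic concrete, here is a minimal NumPy sketch of the per-channel operation that Normalize applies, output = (input - mean) / std, on a (C, H, W) array scaled to [0, 1]. The helper name normalize_chw and the constant names are made up for illustration; this is not torchvision code.

```python
import numpy as np

# ImageNet statistics quoted in the text above.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_chw(img, mean, std):
    """Hypothetical helper: per-channel (x - mean) / std on a (C, H, W) array."""
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)
    return (np.asarray(img, dtype=np.float64) - mean) / std

# A 3x2x2 "image" whose pixels all equal the channel mean maps to all zeros.
img = np.ones((3, 2, 2)) * np.array(IMAGENET_MEAN).reshape(3, 1, 1)
out = normalize_chw(img, IMAGENET_MEAN, IMAGENET_STD)
```

After this step, a dataset whose statistics match the given mean and std has roughly zero mean and unit std in each channel.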
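2. Computing the mean and standard deviation
To get the mean and std of your own dataset, you have to compute them yourself in code.
The code I found online for computing mean and std falls roughly into two approaches:
1) Manual accumulation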
import os
from PIL import Image
import numpy as np

def cal_mean_std_2():
    filepath = 'D:\\Dataset\\OIP\\test_images'
    pathDir = os.listdir(filepath)

    # Pass 1: sum the pixel values of each channel and count the pixels.
    R_channel = 0
    G_channel = 0
    B_channel = 0
    num = 0
    for filename in pathDir:
        with open(os.path.join(filepath, filename), 'rb') as f:
            img = Image.open(f).convert('RGB')
            img = np.asarray(img) / 255.
        R_channel = R_channel + np.sum(img[:, :, 0])
        G_channel = G_channel + np.sum(img[:, :, 1])
        B_channel = B_channel + np.sum(img[:, :, 2])
        # Count the actual pixels instead of assuming every image is 256x256.
        num = num + img.shape[0] * img.shape[1]

    R_mean = R_channel / num
    G_mean = G_channel / num
    B_mean = B_channel / num

    # Pass 2: sum the squared deviations from the channel means.
    R_channel = 0
    G_channel = 0
    B_channel = 0
    for filename in pathDir:
        with open(os.path.join(filepath, filename), 'rb') as f:
            img = Image.open(f).convert('RGB')
            img = np.asarray(img) / 255.
        R_channel = R_channel + np.sum((img[:, :, 0] - R_mean) ** 2)
        G_channel = G_channel + np.sum((img[:, :, 1] - G_mean) ** 2)
        B_channel = B_channel + np.sum((img[:, :, 2] - B_mean) ** 2)

    # The square root of the mean squared deviation is the std, not the variance.
    R_std = np.sqrt(R_channel / num)
    G_std = np.sqrt(G_channel / num)
    B_std = np.sqrt(B_channel / num)
    print("R_mean is %f, G_mean is %f, B_mean is %f" % (R_mean, G_mean, B_mean))
    print("R_std is %f, G_std is %f, B_std is %f" % (R_std, G_std, B_std))
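As a possible variation on the manual accumulation above, the same pooled statistics can be obtained in a single pass by accumulating both the sum and the sum of squares, then using Var[x] = E[x^2] - E[x]^2. This is a swapped-in technique, not the code found online; the function name channel_mean_std is hypothetical, and the synthetic arrays stand in for images already loaded as (H, W, 3) floats in [0, 1].

```python
import numpy as np

def channel_mean_std(images):
    """Pooled per-channel mean/std over an iterable of (H, W, 3) arrays."""
    total = np.zeros(3, dtype=np.float64)
    total_sq = np.zeros(3, dtype=np.float64)
    num_pixels = 0
    for img in images:
        pixels = np.asarray(img, dtype=np.float64).reshape(-1, 3)
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
        num_pixels += pixels.shape[0]
    mean = total / num_pixels
    # Var[x] = E[x^2] - E[x]^2; clip tiny negatives caused by rounding.
    var = np.maximum(total_sq / num_pixels - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Two synthetic images of different sizes (stand-ins for real files).
rng = np.random.default_rng(0)
images = [rng.random((4, 5, 3)), rng.random((8, 3, 3))]
mean, std = channel_mean_std(images)
```

This reads each image only once and does not assume a fixed image size. The E[x^2] - E[x]^2 form can lose precision when the variance is tiny relative to the mean, in which case the two-pass version is the safer choice.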
2) Using built-in functions
import os
from PIL import Image
import numpy as np

def cal_mean_std():
    image_path = 'D:\\Dataset\\OIP\\test_images'
    image_list = []
    file_names = os.listdir(image_path)
    means = [0, 0, 0]
    stdevs = [0, 0, 0]

    # Collect the .jpg files in the directory.
    for name in file_names:
        if name.endswith('.jpg'):
            image_list.append(os.path.join(image_path, name))

    # Accumulate the per-image mean and std of each channel.
    for image in image_list:
        with open(image, 'rb') as f:
            img = Image.open(f).convert('RGB')
            img = np.asarray(img) / 255.
        for i in range(3):
            means[i] += img[:, :, i].mean()
            stdevs[i] += img[:, :, i].std()

    # Average the per-image statistics over the number of images.
    mean = np.asarray(means) / len(image_list)
    std = np.asarray(stdevs) / len(image_list)
    print("mean:{},std:{}".format(mean, std))
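3) Comparing the results
In the output, the first line is the result of method 2, and the second and third lines are the result of method 1.
The results show that:
- The two methods give roughly the same mean, differing only in precision
- The std values differ considerably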
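My own analysis is that the two means differ slightly in precision, and that this error is then amplified by the squaring step when computing std. Note, however, that the two methods also do not compute the same quantity: method 1 takes the std over all pixels pooled across the dataset, while method 2 averages the per-image stds, so the two generally differ even with exact arithmetic.
In the end, I chose method 2 for computing mean and std.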
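When comparing the two std results, it helps to see how a pooled std (method 1 style) and an average of per-image stds (method 2 style) behave on a toy case. The sketch below uses two made-up single-channel "images", not data from the dataset above.

```python
import numpy as np

# Two tiny single-channel "images": one uniformly dark, one uniformly bright.
dark = np.zeros((2, 2))
bright = np.ones((2, 2))

# Method 2 style: average the per-image standard deviations.
mean_of_stds = np.mean([dark.std(), bright.std()])

# Method 1 style: standard deviation over all pixels pooled together.
pooled_std = np.concatenate([dark.ravel(), bright.ravel()]).std()

print(mean_of_stds)  # 0.0
print(pooled_std)    # 0.5
```

Each image is perfectly flat, so the averaged per-image std is 0, while the pooled std also captures the brightness difference between the images. On real data the gap is smaller, but the averaged per-image std will typically be the lower of the two.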