18.train
The training procedure breaks down roughly into:
Dataset: download the dataset
DataLoader: load the dataset
build the neural network
define the loss function and the optimizer, with their parameters
create a TensorBoard writer
set the parameters used during training
start the loop: [train], [validate], save the trained model after each epoch
Each training step consists of: [fetch a batch], [forward pass], [compute loss], [zero gradients], [backpropagation], [parameter update]
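The step sequence above can be sketched as one self-contained training step (a minimal sketch: nn.Linear and the random batch stand in for the DEMO network and the CIFAR10 images used later):

```python
import torch
from torch import nn

# Stand-in model, loss, and optimizer (DEMO from models.py is assumed later)
model = nn.Linear(10, 2)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

imgs = torch.randn(4, 10)            # fetch a batch (here: random data)
targets = torch.tensor([0, 1, 0, 1])

output = model(imgs)                 # forward pass
loss = loss_fn(output, targets)      # compute loss
optimizer.zero_grad()                # zero gradients
loss.backward()                      # backpropagation
optimizer.step()                     # parameter update
```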
During validation we also need to compute the accuracy.
for data in test_dataloader:
    imgs, targets = data
    output = demo(imgs)
    loss = loss_fn(output, targets)
    total_test_loss = total_test_loss + loss
    pre = output.argmax(1)
    acc = (pre == targets).sum()
    total_acc = total_acc + acc
print("total_test_loss = {}".format(total_test_loss))
print("total_acc = {}".format(total_acc))
writer.add_scalar("test_loss", total_test_loss, i)
writer.add_scalar("total_acc", total_acc, i)
Each iteration of the for loop consumes one batch of 64 images.
output = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1]
…… 64 such rows in total; each position stands for one of the classes 0-9
targets = [1]
…… 64 such labels in total (e.g. [3])
Use argmax to reduce output to the same form as targets so the two can be compared:
output.argmax(1) = [8]    (the largest value, 0.9, sits at index 8)
…… 64 such indices in total
(output.argmax(1) == targets) = (F, F, T, ……, F, T) = (0, 0, 1, …, 0, 1): T where they match, F where they don't, 64 entries in total
(output.argmax(1) == targets).sum() adds up the entries of (0, 0, 1, …, 0, 1), which equals the number of correct predictions in the current batch. Accumulating this once per for-loop iteration, after running through the whole test set we can compute the accuracy for the current epoch.
For the 2x2 tensor in the example below, both calls happen to return the same result:
argmax(1) >> tensor([1, 1])    (index of the max within each row, i.e. along dim 1)
argmax(0) >> tensor([1, 1])    (index of the max within each column, i.e. along dim 0)
# argmax and accuracy
import torch

outputs = torch.tensor([[0.1, 0.2],
                        [0.3, 0.4]])
print(outputs.argmax(1))        # tensor([1, 1])
pre = outputs.argmax(1)
targets = torch.tensor([0, 1])
print(pre == targets)           # tensor([False,  True])
print((pre == targets).sum())   # tensor(1)
# when there are 2 inputs, there are 2 outputs
# accuracy = ((pre == targets).sum()) / 2
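Because the square 2x2 example makes argmax(1) and argmax(0) coincide, here is a non-square example (my own, not from the notes) where the difference between the two dims is visible:

```python
import torch

x = torch.tensor([[1.0, 5.0, 2.0],
                  [3.0, 2.0, 4.0]])

print(x.argmax(1))  # one index per row (along dim 1)    -> tensor([1, 2])
print(x.argmax(0))  # one index per column (along dim 0) -> tensor([1, 0, 1])
```

The shapes differ too: argmax(1) returns one entry per row, argmax(0) one entry per column, which is why the validation loop uses argmax(1) to get one predicted class per image.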
Full training script:
import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from models import *

# Prepare the datasets
train_data = torchvision.datasets.CIFAR10(root="./dataset", train=True,
                                          transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10(root="./dataset", train=False,
                                         transform=torchvision.transforms.ToTensor(), download=True)
train_data_size = len(train_data)
test_data_size = len(test_data)
print("Training set size: {}".format(train_data_size))
print("Test set size: {}".format(test_data_size))

# Load the datasets
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

# Build the neural network
demo = DEMO()

# Define the loss function
loss_fn = nn.CrossEntropyLoss()

# Define the optimizer
# learning_rate = 1e-2 = 1 x 10^(-2)
learning_rate = 0.01
optimizer = torch.optim.SGD(demo.parameters(), lr=learning_rate)

# TensorBoard
writer = SummaryWriter("./train_logs")

# Training
# Set some parameters
total_train_step = 0
total_test_step = 0
epoch = 5

for i in range(epoch):
    print("--------------epoch:{}----------------".format(i + 1))

    # Train
    demo.train()  # required when the model contains special layers (Dropout, BatchNorm, ...);
                  # harmless otherwise, so in practice it is (like demo.eval()) called routinely
    for data in train_dataloader:
        imgs, targets = data
        output = demo(imgs)
        loss = loss_fn(output, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_train_step += 1
        if total_train_step % 100 == 0:
            print("step: {}, train_loss = {}".format(total_train_step, loss.item()))
            writer.add_scalar("train_loss", loss.item(), total_train_step)

    # Validate
    demo.eval()
    total_test_loss = 0
    total_acc = 0
    with torch.no_grad():  # no gradients, so no parameters get optimized
        for data in test_dataloader:
            imgs, targets = data
            output = demo(imgs)
            loss = loss_fn(output, targets)
            total_test_loss = total_test_loss + loss
            pre = output.argmax(1)
            acc = (pre == targets).sum()
            total_acc = total_acc + acc
    accuracy = total_acc / test_data_size
    print("total_test_loss = {}".format(total_test_loss))
    print("total_acc = {}".format(accuracy))
    writer.add_scalar("test_loss", total_test_loss, i + 1)
    writer.add_scalar("accuracy", accuracy, i + 1)

    torch.save(demo, "pretrained_demo_{}.pth".format(i + 1))
    print("pretrained_demo saved")

writer.close()
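Since torch.save(demo, path) pickles the whole module object, reloading it later works like this (a minimal sketch using nn.Linear as a stand-in; for the real checkpoint, models.py must be importable so the DEMO class can be unpickled):

```python
import torch
from torch import nn

# Stand-in for DEMO; any nn.Module is saved and loaded the same way.
model = nn.Linear(4, 2)
torch.save(model, "demo_tmp.pth")

# On PyTorch >= 2.6 torch.load defaults to weights_only=True, which
# rejects pickled modules, so pass weights_only=False explicitly.
loaded = torch.load("demo_tmp.pth", weights_only=False)
loaded.eval()  # switch to evaluation mode before inference
```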
