https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html
Preparation
Import packages
import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
Get device for training
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using {device} device')
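A quick sanity check I added (not in the tutorial): a tensor created with device=device should report that device.
t = torch.ones(2, 2, device=device)  # illustrative tensor, my own example
print(t.device)  # cuda:0 or cpu, depending on what is available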
Define the Class
A custom model (class) inherits from nn.Module.
We define our neural network by subclassing nn.Module and initialize the neural network layers in __init__. Every nn.Module subclass implements the operations on input data in the forward method.
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
Create an instance of the model
We create an instance of NeuralNetwork, move it to the device, and print its structure.
model = NeuralNetwork().to(device)
print(model)
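For reference, print(model) lists the submodules in the order they were registered; based on the layer definitions above (not an actual run, so treat as approximate), the output looks like this:
# out (approximately):
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)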
Use the model
Passing data to the model executes the model's forward pass, together with some background operations. Do not call model.forward() directly.
X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
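A small sanity check I added: the model returns one row of 10 raw logits for the single input image, and Softmax turns that row into probabilities that sum to 1.
print(logits.shape)            # torch.Size([1, 10])
print(pred_probab.sum(dim=1))  # each row of probabilities sums to 1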
Model layers
To break the model down, take a sample minibatch of 3 images of size 28x28 and see what happens as it passes through the network.
input_image = torch.rand(3, 28, 28)
print(input_image.size())
# out:
torch.Size([3, 28, 28])
nn.Flatten
We initialize the nn.Flatten layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (the minibatch dimension at dim=0 is maintained).
flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())
# out:
torch.Size([3, 784])
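To make the shape change concrete, here is a minimal check I added, assuming nn.Flatten's default start_dim=1: it keeps the batch dimension and merges the rest, which is the same as reshaping to (3, 784).
print(torch.equal(flat_image, input_image.reshape(3, 28*28)))  # True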
nn.Linear
The linear layer is a module that applies a linear transformation on the input using its stored weights and biases.
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())
# out:
torch.Size([3, 20])
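To see what "stored weights and biases" means, here is a small sketch I added that reproduces the layer by hand: nn.Linear computes x @ W.T + b, where layer1.weight has shape (out_features, in_features) = (20, 784).
manual = flat_image @ layer1.weight.T + layer1.bias  # "manual" is my own helper name
print(torch.allclose(manual, hidden1))  # True, up to floating-point rounding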
nn.ReLU
Activation functions
Non-linear activations are what create the complex mappings between the model’s inputs and outputs. They are applied after linear transformations to introduce nonlinearity, helping neural networks learn a wide variety of phenomena.
In this model, we use nn.ReLU between our linear layers, but there are other activations to introduce non-linearity in your model.
print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")
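As an illustration I added: ReLU simply replaces every negative entry with 0 (an elementwise max(x, 0)), which torch.clamp(..., min=0) reproduces.
sample = torch.randn(2, 3)  # example tensor with positive and negative entries
print(torch.equal(nn.ReLU()(sample), torch.clamp(sample, min=0)))  # True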
nn.Sequential
nn.Sequential is an ordered container of modules. The data is passed through all the modules in the same order as defined. You can use sequential containers to put together a quick network like seq_modules.
seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3, 28, 28)
logits = seq_modules(input_image)
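As with the model earlier, seq_modules returns a batch of raw logits (shape (3, 10)); mirroring that step, I added a conversion to per-class probabilities with nn.Softmax.
pred_probab = nn.Softmax(dim=1)(logits)
print(pred_probab.sum(dim=1))  # each of the 3 rows sums to 1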