https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html
Preparation
Import packages
import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
Get device for training
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using {device} device')
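The same pattern extends to other accelerators. A minimal sketch of the fallback chain, assuming a PyTorch build with Apple Silicon (MPS) support; otherwise the CPU branch is used:

# Prefer CUDA, then MPS (Apple Silicon), then fall back to CPU.
if torch.cuda.is_available():
    device = 'cuda'
elif torch.backends.mps.is_available():
    device = 'mps'
else:
    device = 'cpu'
print(f'Using {device} device')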
Define the Class
A custom model (class) inherits from nn.Module.
We define our neural network by subclassing nn.Module, and initialize the neural network layers in **__init__**.
Every nn.Module subclass implements the operations on input data in the forward method.
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
Create an instance of the model
We create an instance of NeuralNetwork, move it to the device, and print its structure.
model = NeuralNetwork().to(device)
print(model)
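print(model) shows the module structure; to see the learnable tensors each layer registers, you can also iterate over the named parameters. A small sketch:

# Every nn.Linear inside linear_relu_stack registers a weight and a bias tensor.
for name, param in model.named_parameters():
    print(name, param.shape)
# e.g. linear_relu_stack.0.weight -> torch.Size([512, 784])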
Use the model
Passing data to the model runs the model's forward pass, along with some background operations; do not call model.forward() directly.
X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class:{y_pred}")
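For one 28x28 input the model returns a single row of 10 raw scores (logits), and nn.Softmax(dim=1) normalizes across that class dimension. A quick check, continuing from the code above:

print(logits.shape)            # torch.Size([1, 10]): one score per class
print(pred_probab.sum(dim=1))  # each softmax row sums to 1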
Model layers
To see what happens as data moves through the network, take a sample minibatch of 3 images of size 28x28 and pass it through the layers one at a time.
input_image = torch.rand(3, 28, 28)
print(input_image.size())
# out: torch.Size([3, 28, 28])
nn.Flatten
We initialize the nn.Flatten layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (the minibatch dimension, at dim=0, is maintained).
flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())
# out: torch.Size([3, 784])
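nn.Flatten keeps the batch dimension because its default is start_dim=1; flattening from start_dim=0 would merge the batch dimension as well. A short comparison for illustration:

flatten_all = nn.Flatten(start_dim=0)   # flattens every dimension, including the batch
print(flatten_all(input_image).size())  # out: torch.Size([2352]) = 3 * 28 * 28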
nn.Linear
The linear layer is a module that applies a linear transformation on the input using its stored weights and biases.
Linear transformation
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())
# out: torch.Size([3, 20])
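The stored weights and biases are ordinary tensors on the module, so their shapes can be inspected directly; the layer computes flat_image @ weight.T + bias:

print(layer1.weight.size())  # torch.Size([20, 784]): (out_features, in_features)
print(layer1.bias.size())    # torch.Size([20])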
nn.ReLU
Activation function
Non-linear activations are what create the complex mappings between the model’s inputs and outputs. They are applied after linear transformations to introduce nonlinearity, helping neural networks learn a wide variety of phenomena.
In this model, we use nn.ReLU between our linear layers, but there are other activations to introduce non-linearity in your model.
print(f"Before ReLU: {hidden1}\n\n")hidden1 = nn.ReLU()(hidden1)print(f"After ReLU: {hidden1}")
nn.Sequential
nn.Sequential is an ordered container of modules. The data is passed through all the modules in the same order as defined. You can use sequential containers to put together a quick network like seq_modules.
seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3, 28, 28)
logits = seq_modules(input_image)
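If you want each submodule to show up under a readable name in print(seq_modules), nn.Sequential also accepts an OrderedDict; the layer names below are just illustrative:

from collections import OrderedDict

named_seq = nn.Sequential(OrderedDict([
    ('flatten', nn.Flatten()),
    ('fc1', nn.Linear(28*28, 20)),
    ('relu', nn.ReLU()),
    ('fc2', nn.Linear(20, 10)),
]))
print(named_seq(torch.rand(3, 28, 28)).size())  # out: torch.Size([3, 10])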
