In this section we look at how to persist model state by saving and loading it, and how to run model predictions afterwards.
```
import torch
import torchvision.models as models
```
# 1、Saving and Loading Model Weights
PyTorch models store their learned parameters in an internal state dictionary called `state_dict`. A common way to save a model is to serialize this dictionary (which contains the model parameters) with the `torch.save` method:
```
model = models.vgg16(pretrained=True)
torch.save(model.state_dict(), 'model_weights.pth')
```
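To make the contents of a `state_dict` concrete, here is a minimal sketch that inspects one. The `toy` model and the `shapes` variable are hypothetical names introduced for illustration; a small `nn.Linear` stands in for VGG16 so nothing needs to be downloaded.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: one linear layer mapping 4 features to 2 outputs.
toy = nn.Linear(4, 2)

# state_dict() returns an ordered mapping from parameter names to tensors.
state = toy.state_dict()
shapes = {name: tuple(tensor.shape) for name, tensor in state.items()}
print(shapes)  # {'weight': (2, 4), 'bias': (2,)}
```

Only the tensors and their names are stored; the layer structure itself is not part of the dictionary.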
To load model weights, first create an instance of the same model, then load the parameters using the `load_state_dict()` method.
```python
# We do not specify pretrained=True, i.e. we do not load the default weights
model = models.vgg16()
model.load_state_dict(torch.load('model_weights.pth'))
# Be sure to call model.eval() before running inference to set the dropout and batch
# normalization layers to evaluation mode. Failing to do this will yield inconsistent
# inference results.
model.eval()
```
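The effect of `model.eval()` can be seen on a single dropout layer. This is a minimal sketch; the names `drop`, `x`, `train_out` and `eval_out` are hypothetical and not part of the original recipe.

```python
import torch
import torch.nn as nn

# Hypothetical single dropout layer used only to illustrate the mode switch.
drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()          # training mode: elements are randomly zeroed
train_out = drop(x)   # surviving elements are scaled by 1 / (1 - p) = 2.0

drop.eval()           # evaluation mode: dropout passes the input through unchanged
eval_out = drop(x)

print(drop.training)  # False
```

In training mode the output changes from call to call, which is why skipping `eval()` tends to give inconsistent inference results.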

<a name="IIr2h"></a>
# 2、Saving and Loading Models with Shapes
- When loading model weights, we needed to instantiate the model class first, because the class defines the structure of a network. 
- We might want to save the structure of this class together with the model, in which case we can pass `model` (and not `model.state_dict()`) to the saving function:
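```python
torch.save(model, 'model.pth')
```

We can then load the model like this:
```python
# On PyTorch 2.6 and later, torch.load defaults to weights_only=True;
# loading a full pickled model there requires passing weights_only=False.
model = torch.load('model.pth')
```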
- This approach uses the Python pickle module when serializing the model, so it relies on the actual class definition being available when loading the model.
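
The dependency on the class definition follows from how pickle works in general. As a rough illustration, the sketch below pickles an instance of a standard-library class (`fractions.Fraction`, standing in for a model object; it is an assumption chosen for portability, not something the recipe uses) and checks what the payload contains.

```python
import pickle
from fractions import Fraction

# Stand-in for a model object: any instance of an importable class.
payload = pickle.dumps(Fraction(1, 3))

# The payload refers to the class by module and class name; it carries no source code.
has_module_name = b"fractions" in payload
has_class_name = b"Fraction" in payload

# Unpickling looks the class up again by that name, so it must be importable at load time.
restored = pickle.loads(payload)
```

The same applies to a pickled model: if the module defining the model class is missing or has been renamed, loading fails.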
# 3、Saving and Loading（resuming training）a General Checkpoint
Saving and loading a general checkpoint model for inference or resuming training can be helpful for picking up where you last left off.
When saving a general checkpoint, you must save more than just the model’s state_dict. It is important to also save the optimizer’s state_dict, as this contains buffers and parameters that are updated as the model trains. Other items that you may want to save are the epoch you left off on, the latest recorded training loss, external torch.nn.Embedding layers, and more, based on your own algorithm.
- To save multiple checkpoints, you must organize them in a dictionary and use torch.save() to serialize the dictionary. A common PyTorch convention is to save these checkpoints using the .tar file extension.
- To load the items, first initialize the model and optimizer, then load the dictionary locally using `torch.load()`. From there, you can access the saved items by querying the dictionary.

The steps to save and load a general checkpoint are shown below.

## 1）Import all necessary libraries for loading our data
```
import torch
import torch.nn as nn
import torch.optim as optim
```
## 2）Define and initialize the neural network

For the sake of example, we will create a neural network for training images.
```python
import torch.nn.functional as F  # needed for F.relu in forward()

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
```

<a name="lEpFy"></a>
## 3）Initialize the optimizer

- We will use SGD with momentum.
```python
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```

## 4）Save the general checkpoint

Collect all relevant information and build your dictionary.
```python
# Additional information
EPOCH = 5
PATH = "model.pt"
LOSS = 0.4

torch.save({
    'epoch': EPOCH,
    'model_state_dict': net.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': LOSS,
}, PATH)
```

<a name="et8mK"></a>
## 5）Load the general checkpoint

- Remember to first initialize the model and optimizer, then load the dictionary locally.
- You must call `model.eval()` to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.
- If you wish to resume training, call `model.train()` to ensure these layers are in training mode.
```python
model = Net()
# The optimizer must be built from the parameters of the model being restored
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']

model.eval()
# - or -
model.train()
```
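For a self-contained check that a checkpoint really restores optimizer state, the following sketch may help. It assumes a tiny `nn.Linear` in place of `Net`, a temporary file path, and hypothetical names such as `resumed_model` and `start_epoch`; none of these come from the recipe itself.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical tiny model standing in for Net.
model = nn.Linear(3, 1)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# One training step so the optimizer has momentum buffers worth saving.
loss = model(torch.randn(4, 3)).pow(2).mean()
loss.backward()
optimizer.step()

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save({
    "epoch": 5,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "loss": loss.item(),
}, path)

# Rebuild fresh objects, then restore their state from the file.
resumed_model = nn.Linear(3, 1)
resumed_optimizer = optim.SGD(resumed_model.parameters(), lr=0.01, momentum=0.9)
checkpoint = torch.load(path)
resumed_model.load_state_dict(checkpoint["model_state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"] + 1  # continue with the next epoch

resumed_model.train()
```

Saving only the model weights would lose the momentum buffers, so the first steps after resuming would behave differently from an uninterrupted run.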

# 4、Saving and Loading Models across Devices

There may be instances where you want to save and load your neural networks across different devices. Saving and loading models across devices is relatively straightforward using PyTorch.
In this recipe, we will experiment with saving and loading models across CPUs and GPUs. Steps are shown below.

## 1）Import all necessary libraries for loading our data
```
import torch
import torch.nn as nn
import torch.optim as optim
```
## 2）Define and initialize the neural network

For the sake of example, we will create a neural network for training images.
```python
import torch.nn.functional as F  # needed for F.relu in forward()

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
```

<a name="zWBNE"></a>
## 3）Save and Load on CPU/GPU
<a name="lF5J8"></a>
### （1）Save on a GPU, load on a CPU

- When loading a model on a CPU that was trained with a GPU, pass `torch.device('cpu')` to the `map_location` argument in the `torch.load()` function.
```python
# Specify a path to save to
PATH = "model.pt"

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device('cpu')
model = Net()
# The storages underlying the tensors are dynamically remapped to the CPU device
# using the ``map_location`` argument.
model.load_state_dict(torch.load(PATH, map_location=device))
```
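As a quick way to confirm where the loaded tensors end up, here is a minimal sketch that can run without a GPU. It assumes a tiny `nn.Linear` in place of `Net` and a temporary file; `devices` and `cpu_model` are hypothetical names.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical tiny model; on a GPU machine it could be moved to CUDA before saving.
net = nn.Linear(3, 1)
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(net.state_dict(), path)

# map_location remaps every tensor storage to the CPU while loading.
state = torch.load(path, map_location=torch.device("cpu"))
devices = {tensor.device.type for tensor in state.values()}
print(devices)  # {'cpu'}

cpu_model = nn.Linear(3, 1)
cpu_model.load_state_dict(state)
```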

### （2）Save on a GPU, load on a GPU

- When loading a model on a GPU that was trained and saved on a GPU, simply convert the initialized model to a CUDA optimized model using `model.to(torch.device('cuda'))`.
- Be sure to use the `.to(torch.device('cuda'))` function on all model inputs to prepare the data for the model.
- Note that calling `my_tensor.to(device)` returns a new copy of `my_tensor` on the GPU. It does NOT overwrite `my_tensor`. Therefore, remember to manually overwrite tensors: `my_tensor = my_tensor.to(torch.device('cuda'))`.
```python
# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device("cuda")
model = Net()
model.load_state_dict(torch.load(PATH))
model.to(device)
```
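The difference between moving a module and moving a tensor can be sketched as follows. The CPU fallback and the names `same_object`, `inputs` and `outputs` are assumptions added so the example also runs on a machine without a GPU.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(3, 1)
# Module.to() moves the parameters in place and returns the module itself.
same_object = model.to(device) is model

inputs = torch.randn(2, 3)
# Tensor.to() returns a tensor and leaves the original name untouched, so rebind it.
inputs = inputs.to(device)
outputs = model(inputs)
```

Forgetting the reassignment on a GPU machine leaves `inputs` on the CPU, and the forward pass then fails with a device mismatch error.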

<a name="nRet2"></a>
### （3）Save on a CPU, load on a GPU

- When loading a model on a GPU that was trained and saved on CPU, set the `map_location` argument in the `torch.load()` function to `cuda:device_id`. This loads the model to a given GPU device.
- Be sure to call `model.to(torch.device('cuda'))` to convert the model’s parameter tensors to CUDA tensors.
- Finally, also be sure to use the `.to(torch.device('cuda'))` function on all model inputs to prepare the data for the CUDA optimized model.
```python
# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device("cuda")
model = Net()
# Choose whatever GPU device number you want
model.load_state_dict(torch.load(PATH, map_location="cuda:0"))
# Make sure to call input = input.to(device) on any input tensors that you feed to the model
model.to(device)
```

## 4）Saving and loading `DataParallel` models

`torch.nn.DataParallel` is a model wrapper that enables parallel GPU utilization.
To save a `DataParallel` model generically, save the `model.module.state_dict()`. This way, you have the flexibility to load the model any way you want to any device you want.
```python
# Save
torch.save(net.module.state_dict(), PATH)

# Load to whatever device you want
```
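One reason to save `model.module.state_dict()` rather than the wrapper's own `state_dict()` is the key naming. The sketch below compares the two, assuming a tiny `nn.Linear` in place of `Net`; `base`, `parallel`, `wrapped_keys` and `plain_keys` are hypothetical names.

```python
import torch
import torch.nn as nn

# Hypothetical tiny model wrapped in DataParallel.
base = nn.Linear(3, 1)
parallel = nn.DataParallel(base)

wrapped_keys = list(parallel.state_dict().keys())
plain_keys = list(parallel.module.state_dict().keys())
print(wrapped_keys)  # ['module.weight', 'module.bias']
print(plain_keys)    # ['weight', 'bias']

# The unprefixed dictionary loads directly into an unwrapped model.
fresh = nn.Linear(3, 1)
fresh.load_state_dict(parallel.module.state_dict())
```

The wrapper's keys carry a `module.` prefix, so a file saved from the wrapper would not load into a plain model without renaming the keys.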

<a name="RDHX3"></a>
# 5、Saving and Loading Multiple Models in one File

- Saving and loading multiple models can be helpful for reusing models that you have previously trained.
- When saving a model comprised of multiple `torch.nn.Modules`, such as a GAN, a sequence-to-sequence model, or an ensemble of models, you must save a dictionary of each model’s state_dict and corresponding optimizer. You can also save any other items that may aid you in resuming training by simply appending them to the dictionary. 
- To load the models, first initialize the models and optimizers, then load the dictionary locally using `torch.load()`. From here, you can easily access the saved items by simply querying the dictionary as you would expect. 
- In this recipe, we will demonstrate how to save multiple models to one file. The steps are shown below.
<a name="zFWbO"></a>
## 1）Import all necessary libraries for loading our data
```python
import torch
import torch.nn as nn
import torch.optim as optim
```

## 2）Define and initialize the neural network

For the sake of example, we will create a neural network for training images.
```python
import torch.nn.functional as F  # needed for F.relu in forward()

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

netA = Net()
netB = Net()
```

<a name="cHNb2"></a>
## 3）Initialize the optimizer

- We will use SGD with momentum to build an optimizer for each model we created.
```python
optimizerA = optim.SGD(netA.parameters(), lr=0.001, momentum=0.9)
optimizerB = optim.SGD(netB.parameters(), lr=0.001, momentum=0.9)
```

## 4）Save multiple models

Collect all relevant information and build your dictionary.
```python
# Specify a path to save to
PATH = "model.pt"

torch.save({
    'modelA_state_dict': netA.state_dict(),
    'modelB_state_dict': netB.state_dict(),
    'optimizerA_state_dict': optimizerA.state_dict(),
    'optimizerB_state_dict': optimizerB.state_dict(),
}, PATH)
```

<a name="rnAIh"></a>
## 5）Load multiple models

- Remember to first initialize the models and optimizers, then load the dictionary locally.
- You must call `model.eval()` to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.
- If you wish to resume training, call `model.train()` to ensure these layers are in training mode.
```python
modelA = Net()
modelB = Net()
optimizerA = optim.SGD(modelA.parameters(), lr=0.001, momentum=0.9)
optimizerB = optim.SGD(modelB.parameters(), lr=0.001, momentum=0.9)

checkpoint = torch.load(PATH)
modelA.load_state_dict(checkpoint['modelA_state_dict'])
modelB.load_state_dict(checkpoint['modelB_state_dict'])
optimizerA.load_state_dict(checkpoint['optimizerA_state_dict'])
optimizerB.load_state_dict(checkpoint['optimizerB_state_dict'])

modelA.eval()
modelB.eval()
# - or -
modelA.train()
modelB.train()
```
