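Defining the train, validate, and test logic in a LightningModule is convenient, but sometimes you need to save the loss or other data after every batch, or run some operation at the end of every epoch. For that you can use hooks such as on_test_batch_end(), on_test_epoch_end(), and on_test_end().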

on_train_epoch_end()

The definition in ModelHooks is:

    def on_train_epoch_end(self) -> None:
        """Called in the training loop at the very end of the epoch.

        To access all batch outputs at the end of the epoch, either:
        1. Implement `training_epoch_end` in the LightningModule OR
        2. Cache data across steps on the attribute(s) of the `LightningModule` and access them in this hook
        """

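After each epoch I want to save an intermediate weight. This is not the model weights; the need came up while porting a paper's code to Lightning.
So I define on_train_epoch_end():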

    def on_train_epoch_end(self) -> None:
        # mid_weight is the intermediate tensor to save, not the model weights
        torch.save(
            mid_weight, '{}/keys_{}.pt'.format(self.kwargs['exp_dir'], self.current_epoch))
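
Option 2 from the docstring above (caching data on attributes and reading it in the hook) can be sketched without Lightning at all. This is only a minimal illustration: `EpochCacheSketch` and `run_epochs` are hypothetical stand-ins, the "loss" is a placeholder average, and in real code the class would subclass LightningModule and the Trainer would call the hooks.

```python
class EpochCacheSketch:
    """Hypothetical stand-in for a LightningModule; only the caching pattern is shown."""

    def __init__(self):
        self.step_losses = []   # cache filled by training_step
        self.epoch_means = []   # one entry per finished epoch

    def training_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)   # placeholder "loss" for illustration
        self.step_losses.append(loss)
        return loss

    def on_train_epoch_end(self):
        # Read everything cached during the epoch, then reset the cache
        self.epoch_means.append(sum(self.step_losses) / len(self.step_losses))
        self.step_losses.clear()


def run_epochs(module, batches, num_epochs):
    """Toy driver imitating the order in which a trainer calls the hooks."""
    for _ in range(num_epochs):
        for batch_idx, batch in enumerate(batches):
            module.training_step(batch, batch_idx)
        module.on_train_epoch_end()


module = EpochCacheSketch()
run_epochs(module, batches=[[1.0, 3.0], [5.0, 7.0]], num_epochs=2)
print(module.epoch_means)
```

Clearing the cache inside the hook keeps values from one epoch out of the next and stops the list from growing for the whole run.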

on_test_batch_end()

The definition in ModelHooks is:

    def on_test_batch_end(
        self, outputs: Optional[STEP_OUTPUT], batch: Any, batch_idx: int, dataloader_idx: int
    ) -> None:
        """Called in the test loop after the batch.

        Args:
            outputs: The outputs of test_step_end(test_step(x))
            batch: The batched data as it is returned by the test DataLoader.
            batch_idx: the index of the batch
            dataloader_idx: the index of the dataloader
        """

Define your own on_test_batch_end() in modelInterface:

    def test_step(self, batch, batch_index):
        imgs = batch
        outputs, feas, updated_feas, m_items_test, softmax_score_query, softmax_score_memory, compactness_loss = self.model(imgs, self.m_items_test, False)
        mse_imgs = torch.mean(self.tr_recon_loss((outputs[0]+1)/2, (imgs[0]+1)/2)).item()
        mse_feas = compactness_loss.item()
        self.point_sc = point_score(outputs, imgs)
        return feas, mse_imgs, mse_feas, m_items_test

    def on_test_batch_end(self, outputs, batch, batch_idx, dataloader_idx) -> None:
        # len(outputs) == 4: outputs holds the return value of test_step
        # outputs: Tuple[Union[Tuple, List]]
        # Two of the values are saved
        self.kwargs['psnr_list'].append(psnr(outputs[1]))
        self.kwargs['feature_distance_list'].append(outputs[2])

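Key points:

The values in outputs are the return values of test_step(); access them as outputs[0], outputs[1], and so on.
When you define on_test_batch_end(), its parameters must match those of on_test_batch_end in ModelHooks.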
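
How the return value of test_step() reaches on_test_batch_end() can be shown with a small stand-in. This is a sketch under assumptions, not Lightning's own loop: `HookSketch` and `run_test_loop` are hypothetical names, and the tuple has three placeholder elements rather than the four used above.

```python
class HookSketch:
    """Hypothetical stand-in for a LightningModule with two test hooks."""

    def __init__(self):
        self.collected = []

    def test_step(self, batch, batch_idx):
        # Return a tuple; each element is later reachable by index
        return batch, sum(batch), len(batch)

    def on_test_batch_end(self, outputs, batch, batch_idx, dataloader_idx):
        # outputs is exactly what test_step returned for this batch
        self.collected.append(outputs[1])


def run_test_loop(module, batches):
    """Toy driver: call test_step, then hand its result to the hook."""
    for batch_idx, batch in enumerate(batches):
        outputs = module.test_step(batch, batch_idx)
        module.on_test_batch_end(outputs, batch, batch_idx, 0)


module = HookSketch()
run_test_loop(module, [[1, 2], [3, 4]])
print(module.collected)
```

The driver passes the arguments by position, which is why the override needs the same parameters in the same order as the base hook.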
on_test_end()

After testing finishes, I want to save the losses to .npy files.
The definition in ModelHooks is:
```
def on_test_end(self) -> None:
  """Called at the end of testing."""
```

Define your own on_test_end():

def on_test_end(self) -> None:
        np.save(psnr_path, self.kwargs['psnr_list'])
        np.save(feadis_path, self.kwargs['feature_distance_list'])
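
As a quick check of what np.save does with a plain Python list, here is a small round trip. The values and the file name `psnr` are made up for illustration, and a temporary directory stands in for the real output paths.

```python
import os
import tempfile

import numpy as np

psnr_list = [31.2, 29.8, 30.5]   # made-up values for illustration

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "psnr")   # no extension given
    np.save(path, psnr_list)               # the list is converted to an array
    saved_files = os.listdir(tmp_dir)      # np.save appended ".npy" to the name
    loaded = np.load(path + ".npy")        # read back as a NumPy array

print(saved_files, loaded)
```

Because np.save appends `.npy` when the name lacks it, the path used for loading later may differ from the one passed when saving.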

Key point:

When you define an epoch_end-style hook, be explicit about whether it is on_train_epoch_end or on_test_epoch_end, because there is also an on_epoch_end.

on_train_start()

Before training starts, I want to initialize a parameter.
The on_train_start() definition in ModelHooks is:

    def on_train_start(self) -> None:
        """Called at the beginning of training after sanity check."""

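Define your own on_train_start(), then move the tensor in training_step():

    def on_train_start(self) -> None:
        self.weight = F.normalize(
            torch.rand(256, 256, dtype=torch.float),
            dim=1)

    def training_step(self, batch, batch_idx):
        # Put self.weight on the same device as batch; otherwise an error
        # reports that tensors are on multiple devices
        self.weight = self.weight.type_as(batch)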

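Key points:

Anything you create before training must be placed on the same device as the model. During __init__() of a pl.LightningModule, self.device is still cpu; when training on a GPU, it has switched to cuda by the time training_step() runs.
To put a tensor on the same device as the model or other data, use type_as: it moves a tensor created inside the model onto the same device as the reference tensor.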
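
One detail worth knowing is that type_as matches the dtype of the reference tensor as well as its device. The sketch below runs on the CPU only, so it shows the dtype side; `batch` is just a placeholder tensor standing in for real data.

```python
import torch

weight = torch.rand(4, 4)                        # float32 on the CPU by default
batch = torch.zeros(2, 4, dtype=torch.float64)   # placeholder for a real batch

moved = weight.type_as(batch)   # returns a new tensor; `weight` is unchanged

print(moved.dtype, moved.device)
```

If the batch is an integer or half-precision tensor, the weight would be cast to that type too. When only the device should change, `weight.to(batch.device)` is an alternative.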

I will add other hooks here as I use them.