Saving and Loading Models


TensorFlow 2.0 Chinese documentation: Save and Restore Models

Overview: train, save, and load a model with the tf.keras API, using the MNIST dataset.

    $ pip install -q tensorflow==2.0.0-beta1
    $ pip install -q h5py pyyaml

Preparing the training data

    import os

    import tensorflow as tf
    from tensorflow.keras import datasets, layers, models, callbacks

    # Cache the dataset as mnist.npz in the current directory.
    file_path = os.path.abspath('./mnist.npz')
    (train_x, train_y), (test_x, test_y) = datasets.mnist.load_data(path=file_path)

    # Keep only the first 1000 samples so the example runs quickly.
    train_y, test_y = train_y[:1000], test_y[:1000]
    # Flatten each 28x28 image into a 784-vector and scale pixels to [0, 1].
    train_x = train_x[:1000].reshape(-1, 28 * 28) / 255.0
    test_x = test_x[:1000].reshape(-1, 28 * 28) / 255.0
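The reshape and scaling steps above can be checked without downloading MNIST. The sketch below uses a synthetic uint8 array in place of the real images; `fake_images` and `flatten_and_scale` are illustrative names that do not appear in the original tutorial.

```python
import numpy as np

def flatten_and_scale(images):
    """Flatten (n, 28, 28) uint8 images to (n, 784) floats in [0, 1]."""
    return images.reshape(-1, 28 * 28) / 255.0

# Synthetic stand-in for MNIST: 1000 random 28x28 grayscale images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)

flat = flatten_and_scale(fake_images)
print(flat.shape, flat.dtype)
```

Each row of `flat` is one image laid out as a 784-element vector, which matches the `input_shape=(784,)` expected by the first Dense layer of the model.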

Building the model

    def create_model():
        """Build and compile a small fully connected classifier."""
        model = models.Sequential([
            layers.Dense(512, activation='relu', input_shape=(784,)),
            layers.Dropout(0.2),
            layers.Dense(10, activation='softmax')
        ])
        model.compile(optimizer='adam', metrics=['accuracy'],
                      loss='sparse_categorical_crossentropy')
        return model


    def evaluate(target_model):
        """Print the accuracy of target_model on the test set."""
        _, acc = target_model.evaluate(test_x, test_y)
        print("Restore model, accuracy: {:5.2f}%".format(100*acc))
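To get a feel for how many values the saved weight files will hold, the following arithmetic sketch counts the trainable parameters of this architecture. `dense_params` is a helper written here for illustration; the totals should agree with what `model.summary()` reports.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 512)   # Dense(512) on the flattened 28x28 input
output = dense_params(512, 10)    # Dense(10); Dropout adds no parameters
total = hidden + output
print(hidden, output, total)      # 401920 5130 407050
```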

Saving checkpoints automatically

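Saving checkpoints automatically has two benefits. First, once training finishes, the trained model can be reused without retraining. Second, if training is interrupted, it can resume from the last checkpoint.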

The tf.keras.callbacks.ModelCheckpoint callback provides this behavior.

    # File name pattern for the checkpoints; the syntax follows str.format.
    checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
    checkpoint_dir = os.path.dirname(checkpoint_path)

    # period=10: save once every 10 epochs.
    cp_callback = callbacks.ModelCheckpoint(
        checkpoint_path, verbose=1, save_weights_only=True, period=10)

    model = create_model()
    # Save the initial (untrained) weights as epoch 0.
    model.save_weights(checkpoint_path.format(epoch=0))
    model.fit(train_x, train_y, epochs=50, callbacks=[cp_callback],
              validation_data=(test_x, test_y), verbose=0)

Output:

    Epoch 00010: saving model to training_2/cp-0010.ckpt
    Epoch 00020: saving model to training_2/cp-0020.ckpt
    Epoch 00030: saving model to training_2/cp-0030.ckpt
    Epoch 00040: saving model to training_2/cp-0040.ckpt
    Epoch 00050: saving model to training_2/cp-0050.ckpt
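The checkpoint file names in the log follow directly from the `str.format` pattern. The sketch below reproduces them in plain Python, without TensorFlow. `saved_epochs` is a hypothetical helper that simplifies the callback's schedule to "every `period`-th epoch, counting from 1"; it is not the callback's actual implementation.

```python
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"

def saved_epochs(total_epochs, period):
    """Epochs at which a checkpoint is written when saving every `period` epochs."""
    return [e for e in range(1, total_epochs + 1) if e % period == 0]

paths = [checkpoint_path.format(epoch=e) for e in saved_epochs(50, 10)]
for p in paths:
    print(p)
```

The `{epoch:04d}` field pads the epoch number with zeros to four digits, so the files sort in training order when listed alphabetically.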

Load the weights:

    # Find the most recent checkpoint, here 'training_2/cp-0050.ckpt'.
    latest = tf.train.latest_checkpoint(checkpoint_dir)

    # Build a fresh model and load the saved weights into it.
    model = create_model()
    model.load_weights(latest)
    evaluate(model)

Output:

    1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780
    Restore model, accuracy: 87.80%

Saving weights manually

    # Save the weights manually.
    model.save_weights('./checkpoints/manual_checkpoint')

    # Build a fresh model and restore the weights.
    model = create_model()
    model.load_weights('./checkpoints/manual_checkpoint')
    evaluate(model)

Output:

    1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780
    Restore model, accuracy: 87.80%

Saving the entire model

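The examples above save only the model's weights. The model and the optimizer can also be saved together, which covers the weights, the model configuration (architecture), and the optimizer configuration. The advantage is that restoring the model does not depend on the code that originally built it.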

A fully saved model has many uses, for example loading and running it in the browser with TensorFlow.js, or on mobile devices with TensorFlow Lite.

HDF5

Call model.save to save the model as an HDF5 file.

    model.save('my_model.h5')

Restore the complete model from the HDF5 file.

    new_model = models.load_model('my_model.h5')
    evaluate(new_model)

Output:

    1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780
    Restore model, accuracy: 87.80%

saved_model

Save the model in the saved_model format.

    import time

    # Use the current Unix timestamp as the export directory name.
    saved_model_path = "./saved_models/{}".format(int(time.time()))
    tf.keras.experimental.export_saved_model(model, saved_model_path)
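Naming each export after a Unix timestamp gives every export its own directory, and the newest one is simply the directory with the largest number. The sketch below illustrates that naming scheme only; it does not call TensorFlow, and `new_export_dir` and `latest_export_dir` are hypothetical helpers introduced here.

```python
import os
import tempfile
import time

def new_export_dir(base_dir, now=None):
    """Create and return a fresh export directory named after a Unix timestamp."""
    stamp = int(time.time() if now is None else now)
    path = os.path.join(base_dir, str(stamp))
    os.makedirs(path, exist_ok=True)
    return path

def latest_export_dir(base_dir):
    """Return the export directory with the largest numeric name, or None."""
    stamps = [d for d in os.listdir(base_dir) if d.isdigit()]
    if not stamps:
        return None
    return os.path.join(base_dir, max(stamps, key=int))

# Two exports made ten minutes apart (fixed timestamps for reproducibility).
base = tempfile.mkdtemp()
older = new_export_dir(base, now=1560000000)
newest = new_export_dir(base, now=1560000600)
print(latest_export_dir(base) == newest)  # True
```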

Restoring the model and predicting

    new_model = tf.keras.experimental.load_from_saved_model(saved_model_path)
    # Predict with the restored model rather than the original one.
    new_model.predict(test_x).shape

Output:

    (1000, 10)
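The prediction output has one row per sample and one column per digit class, so each row holds the softmax probabilities for digits 0 to 9. A common next step is to take the most probable class per row. The sketch below uses made-up probabilities for three samples instead of real model output.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 10 digit classes.
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 9] = 0.82

labels = probs.argmax(axis=1)  # index of the most probable class per row
print(labels)  # [7 2 9]
```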

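A model restored from the saved_model format can be used directly for prediction (predict). However, saved_model does not store the optimizer configuration, so the model must be compiled before calling evaluate.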

    # Reuse the optimizer of the original model when compiling.
    new_model.compile(optimizer=model.optimizer,
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    evaluate(new_model)

Output:

    1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780
    Restore model, accuracy: 87.80%
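Compiling is what attaches a loss and metrics to the model, and these are the two numbers that evaluate reports. As a rough illustration of what they measure, the sketch below computes a sparse categorical cross-entropy and an accuracy with NumPy on a tiny made-up batch. The function names and data are assumptions for this example, and it is a simplified version of the idea, not Keras's implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true integer labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

def accuracy(probs, labels):
    """Fraction of rows whose most probable class equals the label."""
    return float((probs.argmax(axis=1) == labels).mean())

# Made-up predictions for 4 samples over 3 classes, with integer labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6]])
labels = np.array([0, 1, 2, 2])

acc = accuracy(probs, labels)
loss = sparse_categorical_crossentropy(probs, labels)
print("accuracy: {:5.2f}%".format(100 * acc))  # accuracy: 75.00%
```

"Sparse" here means the labels are plain integers such as 0 to 9 rather than one-hot vectors, which is why the MNIST labels can be passed to the model unchanged.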

Final notes

TensorFlow provides other ways to save models as well.


Full code: GitHub - save_restore_model.ipynb. Reference: Save and restore models
