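- A Keras model consists of multiple components:
   - An architecture, or configuration, which specifies what layers the model contains and how they are connected.
   - A set of weight values (the "state of the model").
   - An optimizer (defined by compiling the model).
   - A set of losses and metrics (defined by compiling the model, or by calling `add_loss()` or `add_metric()`).
- The Keras API can save all of these pieces to disk at once, or save only some of them selectively:
   - Save everything into a single archive in the TensorFlow SavedModel format (or the older Keras H5 format). This is the standard practice.
   - Save only the architecture/configuration, typically as a JSON file.
   - Save only the weight values. This is generally used while training the model.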
- Saving a Keras model:
```python
model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')
```
- Loading the model back:
```python
from tensorflow import keras

model = keras.models.load_model('path/to/location')
```
<a name="j8kaU"></a>
# 1. Setup (import the modules to be used)
```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
```
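# 2. Saving and loading the whole model

- You can save an entire model to a single artifact. It will include:
   - The model's architecture/configuration.
   - The model's weight values (learned during training).
   - The model's compilation information (if `compile()` was called).
   - The optimizer and its state, if any (this lets you resume training where you left off).

## (1) API

- `model.save()` or `tf.keras.models.save_model()`
- `tf.keras.models.load_model()`
- There are two formats for saving an entire model to disk:
   - The TensorFlow SavedModel format (recommended; it is the default when you use `model.save()`).
   - The older Keras H5 format.
- You can switch to the H5 format by:
   - Passing `save_format='h5'` to `save()`.
   - Passing a filename that ends in `.h5` or `.keras` to `save()` (this reflects the TensorFlow 2.x release these examples were run on; later releases use the `.keras` suffix for a separate native Keras format).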

## (2) SavedModel format

```python
def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_model')` creates a SavedModel folder `my_model`.
# jy: the `my_model` folder contains:
#   assets
#   saved_model.pb: stores the model architecture and training configuration
#                   (including the optimizer, losses, and metrics)
#   variables: a directory that holds the weights
model.save("my_model")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)

"""
4/4 [==============================] - 0s 1ms/step - loss: 0.3874
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:111: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:111: Layer.updates (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
INFO:tensorflow:Assets written to: my_model/assets
4/4 [==============================] - 0s 1ms/step - loss: 0.3378
"""
```


- Details of the SavedModel format:
   - [https://tensorflow.google.cn/guide/saved_model#the_savedmodel_format_on_disk](https://tensorflow.google.cn/guide/saved_model#the_savedmodel_format_on_disk)
<a name="H8PxU"></a>
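#### How SavedModel handles custom objects

- When saving the model and its layers, the SavedModel format stores the class name, the **call function**, the losses, and the weights (and the config, if implemented). The call function defines the computation graph of the model/layer.
- If no model/layer config is available, the call function is used to create a model that behaves like the original model and can be trained, evaluated, and used for inference.
- Nevertheless, it is always good practice to define the `get_config` and `from_config` methods when writing a custom model or layer class. This lets you easily update the computation later if needed.
- The example below shows what happens when a custom layer is loaded from the SavedModel format **without** overriding the config methods.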
```python
class CustomModel(keras.Model):
    def __init__(self, hidden_units):
        super(CustomModel, self).__init__()
        self.dense_layers = [keras.layers.Dense(u) for u in hidden_units]

    def call(self, inputs):
        x = inputs
        for layer in self.dense_layers:
            x = layer(x)
        return x


model = CustomModel([16, 16, 10])
# Build the model by calling it
input_arr = tf.random.uniform((1, 5))
outputs = model(input_arr)
model.save("my_model")

# Delete the custom-defined model class to ensure that the loader does not have
# access to it.
del CustomModel

loaded = keras.models.load_model("my_model")
np.testing.assert_allclose(loaded(input_arr), outputs)

print("Original model:", model)
print("Loaded model:", loaded)

"""
INFO:tensorflow:Assets written to: my_model/assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Original model: <__main__.CustomModel object at 0x7f2cec175fd0>
Loaded model: <tensorflow.python.keras.saving.saved_model.load.CustomModel object at 0x7f2d6ce80a58>
"""
```
- As the example above shows, the loader dynamically creates a new model class that behaves like the original model.

## (3) Keras H5 format

- Keras also supports saving a single HDF5 file containing the model's architecture, weight values, and `compile()` information. It is a lightweight alternative to SavedModel.
```python
model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_h5_model.h5')` creates an h5 file `my_h5_model.h5`.
model.save("my_h5_model.h5")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_h5_model.h5")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)

"""
4/4 [==============================] - 0s 1ms/step - loss: 0.3613
4/4 [==============================] - 0s 1ms/step - loss: 0.3248
"""
```

<a name="Ds3rA"></a>
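#### Limitations

- Compared to the SavedModel format, two things are not included in the H5 file:
   - **External losses and metrics** added via `model.add_loss()` and `model.add_metric()` are not saved (unlike SavedModel). If the model has such losses and metrics and you want to resume training, you need to add them back yourself after loading the model.
      - Note: this does not apply to losses/metrics created inside layers via `self.add_loss()` and `self.add_metric()`. As long as the layer is loaded, these losses and metrics are kept, because they are part of the layer's `call` method.
   - The **computation graph of custom objects** (such as custom layers) is not included in the saved file. At loading time, Keras needs access to the Python classes/functions of these objects in order to rebuild the model.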
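- The second limitation can be illustrated with a minimal sketch. The `ScaleLayer` class, its `factor` argument, and the temporary file path are hypothetical names chosen for illustration and do not come from the original notes. Because the H5 file records only the layer's class name and config, the class is handed back through `custom_objects` when loading:

```python
import os
import tempfile

import numpy as np
from tensorflow import keras


class ScaleLayer(keras.layers.Layer):
    """Hypothetical custom layer: multiplies its input by a fixed factor."""

    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        config = super().get_config()
        config.update({"factor": self.factor})
        return config


inputs = keras.Input(shape=(4,))
outputs = ScaleLayer(3.0)(keras.layers.Dense(2)(inputs))
model = keras.Model(inputs, outputs)

# Save to a throwaway location in the H5 format (selected by the .h5 suffix).
h5_path = os.path.join(tempfile.mkdtemp(), "custom_layer_model.h5")
model.save(h5_path)

# The file holds the name "ScaleLayer" and its config, not the Python code,
# so the class is supplied again at load time.
restored = keras.models.load_model(
    h5_path, custom_objects={"ScaleLayer": ScaleLayer}
)

x = np.ones((1, 4), dtype="float32")
np.testing.assert_allclose(
    model.predict(x, verbose=0), restored.predict(x, verbose=0), rtol=1e-6
)
```

Without the `custom_objects` argument, the load is expected to fail, since Keras has no way to resolve the name `ScaleLayer` to a Python class.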
<a name="OzaYg"></a>
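# 3. Saving the architecture

- The model's configuration (or architecture) specifies what layers the model contains and how these layers are connected. If you have the configuration of a model, the model can be created with a freshly initialized state for the weights and no compilation information.
   - Note: this only applies to models defined using the functional or Sequential APIs, not to subclassed models.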
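- What "freshly initialized state" means can be checked with a short sketch (the layer sizes are arbitrary, chosen only for illustration): a model rebuilt from its config has the same weight shapes as the original, but its kernel is drawn again at random.

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.Input((8,)), keras.layers.Dense(4)])
model.compile(optimizer="adam", loss="mse")

# Rebuild the model from its configuration only.
rebuilt = keras.Sequential.from_config(model.get_config())

# Same architecture: the weight shapes line up one to one.
original_shapes = [w.shape for w in model.get_weights()]
rebuilt_shapes = [w.shape for w in rebuilt.get_weights()]

# Fresh state: the kernel was re-initialized at random, so it is not expected
# to match the original kernel.
kernels_match = np.allclose(model.get_weights()[0], rebuilt.get_weights()[0])
print(original_shapes == rebuilt_shapes, kernels_match)
```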
<a name="gWDEM"></a>
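## (1) Configuration of a Sequential model or functional API model

- These types of models are explicit graphs of layers: their configuration is always available in a structured form.
- API
   - `get_config()` and `from_config()`
   - `Model.to_json()` and `tf.keras.models.model_from_json()`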
<a name="K9H7b"></a>
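### (a) `get_config()` and `from_config()`

- Calling `config = model.get_config()` returns a Python dict containing the configuration of the model. The same model can then be rebuilt via `Sequential.from_config(config)` (for a Sequential model) or `Model.from_config(config)` (for a functional API model).
- The same workflow also works for any serializable layer.
- **Layer example:**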
```python
layer = keras.layers.Dense(3, activation="relu")
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)
```
- **Sequential model example:**
```python
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)
```
- **Functional model example:**
```python
inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
```

### (b) `to_json()` and `tf.keras.models.model_from_json()`

- This is similar to `get_config` / `from_config`, except that it turns the model into a JSON string, which can later be loaded without the original model class. It is also specific to models and is not meant for layers. Example:
```python
model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()
new_model = keras.models.model_from_json(json_config)
```
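- Since the result of `to_json()` is an ordinary string, it can be written to a file and read back later. The following is a minimal sketch under that assumption; the file name `architecture.json` and the temporary directory are illustrative choices only.

```python
import json
import os
import tempfile

from tensorflow import keras

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])

# Write the architecture (no weights, no compile information) to a JSON file.
json_path = os.path.join(tempfile.mkdtemp(), "architecture.json")
with open(json_path, "w") as f:
    f.write(model.to_json())

# Later, possibly in another program: read the file and rebuild the model.
with open(json_path) as f:
    json_config = f.read()
new_model = keras.models.model_from_json(json_config)

# The JSON is plain text, so it can also be inspected with the json module.
parsed = json.loads(json_config)
print(parsed["class_name"])
```

The rebuilt model has newly initialized weights; the weight values have to be saved and restored separately.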
    

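## (2) Custom objects

- Models and layers
   - The architecture of subclassed models and layers is defined in the `__init__` and `call` methods. They are considered Python bytecode, which cannot be serialized into a JSON-compatible config. You could try serializing the bytecode (e.g. via `pickle`), but doing so is unsafe, and the model may then fail to load on a different system.
   - In order to save/load a model with custom layers, or a subclassed model, you should override `get_config` and, optionally, `from_config`. In addition, you should register the custom object so that Keras is aware of it.
- Custom functions
   - Custom functions (e.g. an activation, loss, or initializer) do not need a `get_config` method. Registering the function name as a custom object is enough for loading.
- Loading the TensorFlow graph only
   - You can load the TensorFlow graph generated by Keras. In that case you do not need to provide any `custom_objects`. It can be loaded like this:
```python
model.save("my_model")
tensorflow_graph = tf.saved_model.load("my_model")
x = np.random.uniform(size=(4, 32)).astype(np.float32)
predicted = tensorflow_graph(x).numpy()

"""
INFO:tensorflow:Assets written to: my_model/assets
"""
```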


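- Note that this method has several drawbacks:
   - For traceability reasons, you should always have access to the custom objects that were used. You would not want to put into production a model that you cannot re-create.
   - The object returned by `tf.saved_model.load` is not a Keras model, so it is not as easy to use.
      - For example, you will not have access to `.predict()` or `.fit()`.
- Even though its use is discouraged, this method can help in a tight spot, for example if you lost the code of your custom objects or have trouble loading the model with `tf.keras.models.load_model()`.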
   - `tf.saved_model.load`:[https://tensorflow.google.cn/api_docs/python/tf/saved_model/load](https://tensorflow.google.cn/api_docs/python/tf/saved_model/load)
<a name="UOOO8"></a>
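### (a) Defining the config methods

   - `get_config` should return a JSON-serializable dictionary, so that it is compatible with the Keras architecture-saving and model-saving APIs.
   - `from_config(config)` (a `classmethod`) should return a new layer or model object created from the config. The default implementation returns `cls(**config)`.
   - **Example:**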
```python
class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        # Initialize the base Layer before assigning attributes to it.
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name="var_a")

    def call(self, inputs, training=False):
        if training:
            return inputs * self.var
        else:
            return inputs

    def get_config(self):
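        # `.item()` turns the NumPy scalar into a plain Python number so that
        # the config stays JSON-serializable.
        return {"a": self.var.numpy().item()}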

    # There's actually no need to define `from_config` here, since returning
    # `cls(**config)` is the default behavior.
    @classmethod
    def from_config(cls, config):
        return cls(**config)


layer = CustomLayer(5)
layer.var.assign(2)

serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(
    serialized_layer, custom_objects={"CustomLayer": CustomLayer}
)
```

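### (b) Registering the custom object

- Keras keeps a note of which class generated the config. In the example above, `tf.keras.layers.serialize` generates a serialized form of the custom layer: `{'class_name': 'CustomLayer', 'config': {'a': 2}}`
- Keras keeps a master list of all built-in layer, model, optimizer, and metric classes, which is used to find the correct class on which to call `from_config`. If the class cannot be found, an error is raised (`ValueError: Unknown layer`). There are a few ways to register custom classes to this list:
   1. Setting the `custom_objects` argument in the loading function (see the example in "Defining the config methods" above).
   1. `tf.keras.utils.custom_object_scope` or `tf.keras.utils.CustomObjectScope`
   1. `tf.keras.utils.register_keras_serializable`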
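
- The third option is the only one in the list without an example elsewhere in these notes, so here is a minimal sketch of it. The package name `"Demo"` and the `RegisteredScale` layer are hypothetical names introduced only for illustration. Once the class is decorated, it can be deserialized without `custom_objects` and without a scope:

```python
from tensorflow import keras


# Assumption for illustration: "Demo" is an arbitrary package label.
@keras.utils.register_keras_serializable(package="Demo")
class RegisteredScale(keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        config = super().get_config()
        config.update({"factor": self.factor})
        return config


layer = RegisteredScale(factor=0.5)
serialized = keras.layers.serialize(layer)

# No `custom_objects` and no scope here: the decorator already added the class
# to the Keras registry under the name "Demo>RegisteredScale".
new_layer = keras.layers.deserialize(serialized)
```

The decorator suits classes that live in an importable module, since the registration happens as soon as the module is imported.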

### (c) Custom layer and function example

```python
class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units})
        return config


def custom_activation(x):
    return tf.nn.tanh(x) ** 2


# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)

# Retrieve the config
config = model.get_config()

# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.Model.from_config(config)
```

<a name="oa3Lu"></a>
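## (3) In-memory model cloning

- You can clone a model in memory via `tf.keras.models.clone_model()`. This is equivalent to getting the config and then recreating the model from that config (so it preserves neither the compilation information nor the layer weight values).
- **Example:**
```python
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.models.clone_model(model)
```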
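
- That a clone does not keep the weight values can be checked directly. The sketch below uses an arbitrary small Sequential model (not one from these notes) and then copies the values across with `set_weights` for the case where an exact replica is wanted:

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.Input((16,)), keras.layers.Dense(8)])
clone = keras.models.clone_model(model)

# Same architecture, but the clone's kernel was initialized independently.
same_before = np.allclose(model.get_weights()[0], clone.get_weights()[0])

# Copy the values explicitly when an exact replica is needed.
clone.set_weights(model.get_weights())
same_after = np.allclose(model.get_weights()[0], clone.get_weights()[0])

print(same_before, same_after)
```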

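# 4. Saving and loading only the model's weight values

- You can choose to save and load only a model's weights. This can be useful when:
   - You only need the model for inference: there is no need to restart training, so the compilation information and optimizer state are not required.
   - You are doing transfer learning: you train a new model that reuses the state of a prior model, so the compilation information of the prior model is not required.

## (1) APIs for in-memory weight transfer

- Weights can be copied between different objects by using `get_weights` and `set_weights`:
   - `tf.keras.layers.Layer.get_weights()`: returns a list of NumPy arrays.
   - `tf.keras.layers.Layer.set_weights()`: sets the weights to the values in the `weights` argument.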

### (a) Transferring weights from one layer to another, in memory

```python
def create_layer():
    layer = keras.layers.Dense(64, activation="relu", name="dense_2")
    layer.build((None, 784))
    return layer


layer_1 = create_layer()
layer_2 = create_layer()

# Copy weights from layer 1 to layer 2
layer_2.set_weights(layer_1.get_weights())
```

<a name="fOlZ9"></a>
### (b) Transferring weights from one model to another model with a compatible architecture, in memory
```python
# Create a simple functional model
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
    def __init__(self, output_dim, name=None):
        super(SubclassedModel, self).__init__(name=name)
        self.output_dim = output_dim
        self.dense_1 = keras.layers.Dense(64, activation="relu", name="dense_1")
        self.dense_2 = keras.layers.Dense(64, activation="relu", name="dense_2")
        self.dense_3 = keras.layers.Dense(output_dim, name="predictions")

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.dense_2(x)
        x = self.dense_3(x)
        return x

    def get_config(self):
        return {"output_dim": self.output_dim, "name": self.name}


subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))

# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())

assert len(functional_model.weights) == len(subclassed_model.weights)
for a, b in zip(functional_model.weights, subclassed_model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())
```

### (c) The case of stateless layers

- Because stateless layers do not change the order or number of weights, models can have compatible architectures even if there are extra/missing stateless layers.
```python
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)

# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model_with_dropout = keras.Model(
    inputs=inputs, outputs=outputs, name="3_layer_mlp"
)

functional_model_with_dropout.set_weights(functional_model.get_weights())
```

<a name="RQfkl"></a>
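## (2) APIs for saving weights to disk and loading them back

- Weights can be saved to disk by calling `model.save_weights` in one of the following formats:
   - TensorFlow Checkpoint (the default format)
   - HDF5
- There are two ways to specify the save format:
   1. The `save_format` argument: set the value to `save_format="tf"` or `save_format="h5"`.
   1. The `path` argument: if the path ends with `.h5` or `.hdf5`, the HDF5 format is used. For any other suffix, the TensorFlow Checkpoint format is used unless `save_format` is set.
- There is also the option of retrieving the weights as in-memory NumPy arrays. Each API has its own pros and cons, detailed below.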
<a name="GUTSG"></a>
### (a) TF Checkpoint format

- **Example:**
```python
# Runnable example
sequential_model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(64, activation="relu", name="dense_2"),
        keras.layers.Dense(10, name="predictions"),
    ]
)
sequential_model.save_weights("ckpt")
load_status = sequential_model.load_weights("ckpt")

# `assert_consumed` can be used as validation that all variable values have been
# restored from the checkpoint. See `tf.train.Checkpoint.restore` for other
# methods in the Status object.
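load_status.assert_consumed()

"""
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f2d6cb709e8>
"""
```
- Format details
   - The TensorFlow Checkpoint format saves and restores the weights using object attribute names. Take the `tf.keras.layers.Dense` layer as an example. The layer contains two weights: `dense.kernel` and `dense.bias`. When the layer is saved in the `tf` format, the resulting checkpoint contains the keys `"kernel"` and `"bias"` and their corresponding weight values.
   - Note: attributes/graph edges are named after the name used in the parent object, not after the name of the variable.
      - Example (the variable `CustomLayer.var` is saved with `"var"` as part of the key, not `"var_a"`):
```python
class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        # Initialize the base Layer before assigning attributes to it.
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name="var_a")


layer = CustomLayer(5)
layer_ckpt = tf.train.Checkpoint(layer=layer).save("custom_layer")

ckpt_reader = tf.train.load_checkpoint(layer_ckpt)

ckpt_reader.get_variable_to_dtype_map()

"""
{'save_counter/.ATTRIBUTES/VARIABLE_VALUE': tf.int64,
 '_CHECKPOINTABLE_OBJECT_GRAPH': tf.string,
 'layer/var/.ATTRIBUTES/VARIABLE_VALUE': tf.int32}
"""
```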

<a name="f4Qc0"></a>
#### Transfer learning example

- Essentially, as long as two models have the same architecture, they can share the same checkpoint.
```python
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

# Extract a portion of the functional model defined in the Setup section.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(
    functional_model.inputs, functional_model.layers[-1].input, name="pretrained_model"
)
# Randomly assign "trained" weights.
for w in pretrained.weights:
    w.assign(tf.random.normal(w.shape))
pretrained.save_weights("pretrained_ckpt")
pretrained.summary()
"""
Model: "pretrained_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
digits (InputLayer)          [(None, 784)]             0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160      
=================================================================
Total params: 54,400
Trainable params: 54,400
Non-trainable params: 0
_________________________________________________________________
"""

# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(5, name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="new_model")

# Load the weights from pretrained_ckpt into model.
model.load_weights("pretrained_ckpt")

# Check that all of the pretrained weights have been loaded.
for a, b in zip(pretrained.weights, model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

print("\n", "-" * 50)
model.summary()
"""
--------------------------------------------------
Model: "new_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
digits (InputLayer)          [(None, 784)]             0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160      
_________________________________________________________________
predictions (Dense)          (None, 5)                 325       
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
"""


# Example 2: Sequential model
# Recreate the pretrained model, and load the saved weights.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
pretrained_model = keras.Model(inputs=inputs, outputs=x, name="pretrained")

# Sequential example:
model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name="predictions")])
model.summary()
"""
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
pretrained (Functional)      (None, 64)                54400     
_________________________________________________________________
predictions (Dense)          (None, 5)                 325       
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
"""

pretrained_model.load_weights("pretrained_ckpt")
"""
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f2d6cb47080>
"""

# Warning! Calling `model.load_weights('pretrained_ckpt')` won't throw an error,
# but will *not* work as expected. If you inspect the weights, you'll see that
# none of the weights will have loaded. `pretrained_model.load_weights()` is the
# correct method to call.
```
- It is generally recommended to stick to the same API for building models. If you switch between Sequential and functional models, or between functional and subclassed models, etc., always rebuild the pretrained model and load the pretrained weights into that model.
- How can weights be saved and loaded into a different model if the model architectures are quite different?
   - Use `tf.train.Checkpoint` to save and restore the exact layers/variables. Example:
```python
# Create a subclassed model that essentially uses functional_model's first
# and last layers.
# First, save the weights of functional_model's first and last dense layers.
first_dense = functional_model.layers[1]
last_dense = functional_model.layers[-1]
ckpt_path = tf.train.Checkpoint(
    dense=first_dense, kernel=last_dense.kernel, bias=last_dense.bias
).save("ckpt")

# Define the subclassed model.
class ContrivedModel(keras.Model):
    def __init__(self):
        super(ContrivedModel, self).__init__()
        self.first_dense = keras.layers.Dense(64)
        self.kernel = self.add_variable("kernel", shape=(64, 10))
        self.bias = self.add_variable("bias", shape=(10,))

    def call(self, inputs):
        x = self.first_dense(inputs)
        return tf.matmul(x, self.kernel) + self.bias


model = ContrivedModel()
# Call model on inputs to create the variables of the dense layer.
_ = model(tf.ones((1, 784)))

# Create a Checkpoint with the same structure as before, and load the weights.
tf.train.Checkpoint(
    dense=model.first_dense, kernel=model.kernel, bias=model.bias
).restore(ckpt_path).assert_consumed()

"""
WARNING:tensorflow:From :15: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.add_weight method instead.
"""
```

<a name="DIdVC"></a>
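### (b) HDF5 format

- The HDF5 format contains weights grouped by layer names. The weights are lists ordered by concatenating the list of trainable weights to the list of non-trainable weights (the same as `layer.weights`). Thus, a model can use an HDF5 checkpoint if it has the same layers and trainable statuses as those saved in the checkpoint.
- **Example:**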
```python
# Runnable example
sequential_model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(64, activation="relu", name="dense_2"),
        keras.layers.Dense(10, name="predictions"),
    ]
)
sequential_model.save_weights("weights.h5")
sequential_model.load_weights("weights.h5")
```
- Note: changing `layer.trainable` may result in a different `layer.weights` ordering when the model contains nested layers.
```python
class NestedDenseLayer(keras.layers.Layer):
    def __init__(self, units, name=None):
        super(NestedDenseLayer, self).__init__(name=name)
        self.dense_1 = keras.layers.Dense(units, name="dense_1")
        self.dense_2 = keras.layers.Dense(units, name="dense_2")

    def call(self, inputs):
        return self.dense_2(self.dense_1(inputs))


nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, "nested")])
variable_names = [v.name for v in nested_model.weights]
print("variables: {}".format(variable_names))

print("\nChanging trainable status of one of the nested layers...")
nested_model.get_layer("nested").dense_1.trainable = False

variable_names_2 = [v.name for v in nested_model.weights]
print("\nvariables: {}".format(variable_names_2))
print("variable ordering changed:", variable_names != variable_names_2)

"""
variables: ['nested/dense_1/kernel:0', 'nested/dense_1/bias:0', 'nested/dense_2/kernel:0', 'nested/dense_2/bias:0']

Changing trainable status of one of the nested layers...

variables: ['nested/dense_2/kernel:0', 'nested/dense_2/bias:0', 'nested/dense_1/kernel:0', 'nested/dense_1/bias:0']
variable ordering changed: True
"""
```
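
- The "trainable weights first, then non-trainable weights" ordering described for the HDF5 format can also be seen on a single built-in layer. The sketch below picks `BatchNormalization` (a layer not used elsewhere in these notes) because it owns both kinds of weights; the input width of 4 is arbitrary.

```python
from tensorflow import keras

bn = keras.layers.BatchNormalization()
bn.build((None, 4))

# gamma and beta are trainable; the moving mean and moving variance are not.
trainable_flags = [w.trainable for w in bn.weights]
print(trainable_flags)

# Count how many weights fall into each group.
n_trainable = len(bn.trainable_weights)
n_non_trainable = len(bn.non_trainable_weights)
print(n_trainable, n_non_trainable)
```

An HDF5 weight file written for this layer is only expected to load cleanly into a layer whose weights have the same trainable statuses, because the position in the list is what identifies each weight.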

<a name="qNOyw"></a>
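#### Transfer learning example

- When loading pretrained weights from HDF5, it is recommended to load the weights into the original checkpointed model, and then extract the desired weights/layers into a new model.
- **Example:**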
```python
def create_functional_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = keras.layers.Dense(10, name="predictions")(x)
    return keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")


functional_model = create_functional_model()
functional_model.save_weights("pretrained_weights.h5")

# In a separate program:
pretrained_model = create_functional_model()
pretrained_model.load_weights("pretrained_weights.h5")

# Create a new model by extracting layers from the original model:
extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name="dense_3"))
model = keras.Sequential(extracted_layers)
model.summary()
"""
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 325       
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
"""
```