First model: a fully connected network
All models are callable, just like layers
Multi-input and multi-output models
Shared layers
- More examples

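First model: a fully connected network

Let's start with a simple model.
The Sequential model is certainly the more natural way to build a fully connected network, but a simple fully connected network is a good first example for learning the functional API. Before starting, a few concepts need clarifying:

- A layer instance takes a tensor as its argument and returns a tensor.
- Anything that takes tensors as input and produces tensors as output can be wrapped as a model, defined with Model.
- Such a model can be trained just like a Keras Sequential model.

For example: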

from keras.layers import Input, Dense
from keras.models import Model
# This returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)  # starts training
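
The snippet above assumes that `data` and `labels` already exist. As a minimal sketch, the version below fills them with random NumPy arrays so the code runs end to end. The random data, `num_samples`, and the batch size are illustrative assumptions, not part of the original example. It trains for one epoch and then looks at the shape of the predictions:

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras.utils import to_categorical

# Synthetic stand-ins for real training data (assumption for illustration).
num_samples = 256
data = np.random.random((num_samples, 784)).astype('float32')
labels = to_categorical(np.random.randint(10, size=num_samples), num_classes=10)

# Same architecture as above: Input -> Dense(64) -> Dense(64) -> Dense(10).
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(data, labels, epochs=1, batch_size=32, verbose=0)

# Each row is a probability distribution over the 10 classes.
probabilities = model.predict(data[:4], verbose=0)
print(probabilities.shape)
```

With random labels the accuracy is meaningless, of course; the point is only that tensors flow from `Input` to the softmax output and that the model trains like a Sequential one.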

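All models are callable, just like layers

With the functional API it is easy to reuse a trained model: you can treat any model as if it were a layer and call it on a tensor. Note that calling a model reuses not only its architecture but also its weights.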

x = Input(shape=(784,))
# This works, and returns the 10-way softmax we defined above.
y = model(x)

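This makes it quick to build models that process sequences of inputs. For example, an image classification model can be turned into a video classification model with a single line of code: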

from keras.layers import TimeDistributed
# Input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))
# This applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
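
To see that the wrapped model really is applied once per timestep with the same weights, here is a small self-contained sketch. The single-layer `frame_model` and the random `batch` are assumptions chosen to keep the example short; the original uses the three-layer model defined earlier.

```python
import numpy as np
from keras.layers import Input, Dense, TimeDistributed
from keras.models import Model

# A small per-frame classifier (stand-in for the model defined earlier).
frame_input = Input(shape=(784,))
frame_output = Dense(10, activation='softmax')(frame_input)
frame_model = Model(inputs=frame_input, outputs=frame_output)

# Wrap it so it runs on every one of the 20 timesteps.
input_sequences = Input(shape=(20, 784))
processed_sequences = TimeDistributed(frame_model)(input_sequences)
sequence_model = Model(inputs=input_sequences, outputs=processed_sequences)

# Two random sequences of 20 frames each.
batch = np.random.random((2, 20, 784)).astype('float32')
per_step = sequence_model.predict(batch, verbose=0)

# Running the frame model directly on timestep 0 should give the same result,
# because the wrapper reuses the frame model's weights.
first_step = frame_model.predict(batch[:, 0], verbose=0)
print(per_step.shape, np.allclose(per_step[:, 0], first_step, atol=1e-5))
```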

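Multi-input and multi-output models

Note that any layer can be given a name through the `name` argument. These names can then be used when compiling and training the model.
For a detailed example, see the official documentation: https://keras-cn.readthedocs.io/en/latest/getting_started/functional_API/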
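
As a rough sketch of how layer names come into play, the toy model below has two inputs and two outputs and refers to them by name in `compile` and `fit`. The layer sizes, the names `main_input`, `aux_input`, `main_output`, `aux_output`, and the random data are all assumptions made for illustration. The example in the linked documentation is larger and uses an embedding and an LSTM.

```python
import numpy as np
from keras.layers import Input, Dense, concatenate
from keras.models import Model

# Two named inputs.
main_input = Input(shape=(16,), name='main_input')
aux_input = Input(shape=(4,), name='aux_input')

# An auxiliary output branches off the hidden representation.
hidden = Dense(8, activation='relu')(main_input)
aux_output = Dense(1, activation='sigmoid', name='aux_output')(hidden)

# The main output also sees the auxiliary input.
merged = concatenate([hidden, aux_input])
main_output = Dense(1, activation='sigmoid', name='main_output')(merged)

model = Model(inputs=[main_input, aux_input],
              outputs=[main_output, aux_output])

# Losses and loss weights are assigned per output, keyed by layer name.
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy',
                    'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1.0, 'aux_output': 0.2})

# Random placeholder data, again keyed by input and output names.
n = 32
features = {'main_input': np.random.random((n, 16)).astype('float32'),
            'aux_input': np.random.random((n, 4)).astype('float32')}
targets = {'main_output': np.random.randint(2, size=(n, 1)).astype('float32'),
           'aux_output': np.random.randint(2, size=(n, 1)).astype('float32')}

model.fit(features, targets, epochs=1, batch_size=8, verbose=0)
main_pred, aux_pred = model.predict(features, verbose=0)
print(main_pred.shape, aux_pred.shape)
```

Keying by name avoids relying on the order of the input and output lists, which is easy to get wrong once a model has several of each.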

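Shared layers

Another good use for the functional API is models that use shared layers.
Consider a dataset of tweets. We want to build a model that can tell whether two tweets come from the same user; the same setup can also be used to judge how similar two tweets are.
One way to do this is to build a model that encodes each of the two tweets into a feature vector, concatenates the two vectors, and adds a logistic regression layer on top, which outputs the probability that the two tweets come from the same user. The training data for such a model consists of pairs of tweets.
Because the problem is symmetric, the part of the model that encodes the first tweet can be reused, weights included, to encode the second. Here we use a shared LSTM layer for the encoding.
Example: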

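First, we turn each tweet into a matrix of shape (140, 256): a sequence of 140 characters, where each character is represented by a 256-dimensional one-hot vector. An element is 1 if the corresponding character occurs at that position and 0 otherwise.
- The shape is (140, 256) because a tweet has at most 140 characters and the extended ASCII table encodes 256 common characters. The original English documentation talks about tweets here, so this is a reasonable choice for that kind of text. For Chinese text, a single character would need a vector with far more than 256 dimensions.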
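
The encoding described above can be sketched in a few lines of NumPy. The helper name `encode_tweet` is hypothetical, and two details are assumptions the text does not specify: characters outside the 256-value range are replaced by `?`, and positions past the end of a short tweet are left as all-zero rows.

```python
import numpy as np

MAX_LEN = 140   # maximum number of characters in a tweet
VOCAB = 256     # one slot per extended-ASCII code

def encode_tweet(text):
    """Return a (140, 256) one-hot matrix for `text` (hypothetical helper)."""
    matrix = np.zeros((MAX_LEN, VOCAB), dtype=np.float32)
    # latin-1 maps each character to a single byte in 0..255;
    # anything it cannot represent becomes '?'.
    codes = text.encode('latin-1', errors='replace')[:MAX_LEN]
    for position, code in enumerate(codes):
        matrix[position, code] = 1.0
    return matrix

example = encode_tweet('hello keras')
print(example.shape, int(example.sum()))
```

A batch of such matrices, stacked along a new first axis, is what the `Input(shape=(140, 256))` tensors expect.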
To share a layer across different inputs, instantiate the layer once and then call it as many times as needed:

```python
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model

tweet_a = Input(shape=(140, 256))
tweet_b = Input(shape=(140, 256))

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively the same layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# data_a, data_b and labels are training arrays supplied by the caller
model.fit([data_a, data_b], labels, epochs=10)
```
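
One way to convince yourself that the LSTM weights really are shared is to compare parameter counts. The sketch below builds the pair model twice, once with a single LSTM instance and once with two independent ones. The helper `build_pair_model` and its `share_weights` flag are hypothetical names introduced only for this comparison.

```python
from keras.layers import Input, LSTM, Dense, concatenate
from keras.models import Model

def build_pair_model(share_weights):
    """Build the two-tweet model, with or without a shared LSTM."""
    tweet_a = Input(shape=(140, 256))
    tweet_b = Input(shape=(140, 256))
    lstm_a = LSTM(64)
    # Reuse the same instance to share weights, or create a second one.
    lstm_b = lstm_a if share_weights else LSTM(64)
    merged = concatenate([lstm_a(tweet_a), lstm_b(tweet_b)], axis=-1)
    predictions = Dense(1, activation='sigmoid')(merged)
    return Model(inputs=[tweet_a, tweet_b], outputs=predictions)

shared_model = build_pair_model(share_weights=True)
separate_model = build_pair_model(share_weights=False)

shared_params = shared_model.count_params()
separate_params = separate_model.count_params()

# The difference is exactly one LSTM's worth of parameters.
print(shared_params, separate_params, separate_params - shared_params)
```

Besides saving parameters, sharing guarantees that both tweets are embedded in the same feature space, which is what makes the concatenated vector meaningful for a symmetric problem.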

:::info
Let's pause here and look at what the shared layer actually outputs, and what the shape of that output is.
:::
<a name="n6rk4"></a>
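#### The concept of layer "nodes"
- Whenever you call a layer on an input, you create a new tensor (the layer's output) and also add a "(computation) node" to the layer. The node maps the input tensor to the output tensor. When you call the same layer several times, it owns several nodes, indexed 0, 1, 2, ...
- In the previous version of Keras, you could get a layer's output tensor with layer.get_output(), or the shape of its output tensor with layer.output_shape. You can still do this in the current version (except that layer.get_output() has been replaced by the `output` property). But what happens when a layer is connected to multiple inputs?
   - If a layer is connected to only one input, there is no ambiguity: `.output` returns the layer's single output.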
```python
from keras.layers import Input, LSTM

a = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
assert lstm.output == encoded_a
```

When the layer is connected to multiple inputs, however, this becomes a problem:

```python
a = Input(shape=(140, 256))
b = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
encoded_b = lstm(b)
lstm.output
# This raises an error:
# >> AssertionError: Layer lstm_1 has multiple inbound nodes,
# hence the notion of "layer output" is ill-defined.
# Use `get_output_at(node_index)` instead.
```

Asking for the output at a specific node index resolves this:

```python
assert lstm.get_output_at(0) == encoded_a
assert lstm.get_output_at(1) == encoded_b
```

The same applies to input_shape and output_shape. If a layer has only one node, or if all of its nodes have the same input and output shapes, then input_shape and output_shape are unambiguous and each returns a single value. But if, for example, you apply the same Conv2D layer to an input of shape (32, 32, 3) and then to an input of shape (64, 64, 3), the layer has several input and output shapes, and you have to specify the node index explicitly to say which one you want:

```python
from keras.layers import Input, Conv2D

a = Input(shape=(32, 32, 3))
b = Input(shape=(64, 64, 3))
conv = Conv2D(16, (3, 3), padding='same')
conved_a = conv(a)
# Only one input so far, the following will work:
assert conv.input_shape == (None, 32, 32, 3)
conved_b = conv(b)
# now the `.input_shape` property wouldn't work, but this does:
assert conv.get_input_shape_at(0) == (None, 32, 32, 3)
assert conv.get_input_shape_at(1) == (None, 64, 64, 3)
```
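
The node-indexed accessors above belong to the Keras version these notes describe, and they may not be available in every release. A version-independent way to get the same information, sketched below under that assumption, is to keep the tensors returned by each call and read their shapes directly:

```python
from keras.layers import Input, Conv2D

a = Input(shape=(32, 32, 3))
b = Input(shape=(64, 64, 3))
conv = Conv2D(16, (3, 3), padding='same')

# Each call creates one node and returns that node's output tensor.
conved_a = conv(a)
conved_b = conv(b)

# 'same' padding keeps the spatial size; the batch dimension is None.
shape_a = tuple(conved_a.shape)
shape_b = tuple(conved_b.shape)

# Both calls use the same kernel and bias.
num_weights = len(conv.trainable_weights)
print(shape_a, shape_b, num_weights)
```

The two output shapes differ even though the layer has a single kernel and a single bias, which is exactly why a bare `output_shape` on the layer would be ambiguous.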
