1. Model Representation I<br />Let's examine how we will represent a hypothesis function using neural networks. At a very simple level, neurons are basically computational units that take inputs (dendrites) as electrical inputs (called "spikes") that are channeled to outputs (axons). In our model, our dendrites are like the input features, and the output is the result of our hypothesis function. In this model, our x_0 input node is sometimes called the "bias unit"; it is always equal to 1. In neural networks we use the same logistic function as in classification, ![](https://cdn.nlark.com/yuque/__latex/3cc62f6bf12d690298d35756518411a2.svg), yet we sometimes call it a sigmoid (logistic) activation function. In this situation, our "theta" parameters are sometimes called "weights."<br />Visually, a simplistic representation looks like:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584592991837-0c40db51-b815-40c3-b430-526a288c2daf.png)<br />Our input nodes (layer 1), also known as the "input layer," go into another node (layer 2), which finally outputs the hypothesis function, known as the "output layer."<br />We can have intermediate layers of nodes between the input and output layers, called "hidden layers." In this example, we label these intermediate or "hidden" layer nodes ![](https://cdn.nlark.com/yuque/__latex/1a76f4d7eedcf6cf04688746c741fc4c.svg) and call them "activation units."<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584593212267-a9b20f81-67c3-4559-ab41-644d3f976fd4.png)<br />If we had one hidden layer, it would look like:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584593231111-241143ff-bb33-41e1-9737-0b2e11e788c5.png)<br />The values of each "activation" node are obtained as follows:<br 
/>![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584593249966-0b9b0904-5782-46b8-9bee-f70ea18560d3.png)<br />This says that we compute our activation nodes by using a 3×4 matrix of parameters. We apply each row of the parameters to our inputs to obtain the value for one activation node. Our hypothesis output is the logistic function applied to the sum of the values of our activation nodes, which have been multiplied by yet another parameter matrix ![](https://cdn.nlark.com/yuque/__latex/ccc61ccb3eb94520e8b770edf542378b.svg) containing the weights for our second layer of nodes. Each layer gets its own matrix of weights, ![](https://cdn.nlark.com/yuque/__latex/051a2760f99796a2e07758e2620392ca.svg).<br />The dimensions of these weight matrices are determined as follows:<br />If the network has ![](https://cdn.nlark.com/yuque/__latex/14730a16e7ffd22066fda608764658c0.svg) units in layer j and ![](https://cdn.nlark.com/yuque/__latex/bb5fd62878b7cdec356b340e1321392b.svg) units in layer j+1, then ![](https://cdn.nlark.com/yuque/__latex/051a2760f99796a2e07758e2620392ca.svg) will be of dimension ![](https://cdn.nlark.com/yuque/__latex/6f3ea499376ec42a8096cdbc51943d11.svg).<br />The +1 comes from the addition in ![](https://cdn.nlark.com/yuque/__latex/051a2760f99796a2e07758e2620392ca.svg) of the "bias nodes," ![](https://cdn.nlark.com/yuque/__latex/3e0d691f3a530e6c7e079636f20c111b.svg) and ![](https://cdn.nlark.com/yuque/__latex/757fa5b18faf8a9db42a564ffd08d5e1.svg). In other words, the output nodes will not include the bias node, while the inputs will. The following image summarizes our model representation:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584594733448-2862b250-97c0-4409-be50-df585a461456.png)<br 
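/>
The computation summarized above (each activation node from one row of a 3×4 parameter matrix, then the hypothesis from a second weight matrix) can be sketched in NumPy. This is a minimal illustration, not code from the course: the layer sizes follow the example (3 inputs, 3 activation units, 1 output), while the weight values themselves are arbitrary.

```python
import numpy as np

def sigmoid(z):
    """The logistic (sigmoid) activation function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes: s_1 = 3 inputs, s_2 = 3 activation units, s_3 = 1 output.
# By the dimension rule, Theta^(j) is s_(j+1) x (s_j + 1):
# Theta1 is 3 x 4 and Theta2 is 1 x 4. The weight values here are arbitrary.
rng = np.random.default_rng(0)
Theta1 = rng.standard_normal((3, 4))
Theta2 = rng.standard_normal((1, 4))

x = np.array([1.0, 0.5, -1.2, 0.3])  # input with bias unit x_0 = 1 prepended

# Each row of Theta1 yields one activation node a_i^(2).
a2 = sigmoid(Theta1 @ x)             # shape (3,)

# The output layer sees a bias unit a_0^(2) = 1 in front of the activations.
a2 = np.concatenate(([1.0], a2))     # shape (4,)
h = sigmoid(Theta2 @ a2)             # hypothesis h_Theta(x), shape (1,)
```

Note how the shapes follow the rule above: with s_1 = 3 and s_2 = 3, Theta1 is s_2 × (s_1 + 1) = 3 × 4, and the output h is a single value in (0, 1).
<br 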
/>![捕获.PNG](https://cdn.nlark.com/yuque/0/2021/png/22415994/1630206506329-13fcc727-26da-4794-bca0-1486e77a75c6.png)<br /> <br />Model Representation II<br />To reiterate, the following is an example of a neural network:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584601142833-135331b7-af45-41f8-8440-8c75d77705b7.png)<br />In this section we'll do a vectorized implementation of the functions above. We are going to define a new variable ![](https://cdn.nlark.com/yuque/__latex/210d9c02ac501ae00ebe288011858f7c.svg) that encompasses the parameters inside our g function. In our previous example, if we replaced all the parameters with the variable z, we would get:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584602115241-cfea867b-0ac1-434a-bf4e-377a34f5905e.png)<br />In other words, for layer j=2 and node k, the variable z will be:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584602153979-b01d8bfc-1607-4c1a-9cb0-6bb35c1c9a8e.png)<br />Setting 
![](https://cdn.nlark.com/yuque/__latex/daa24768a5c4ab7e627880c135fc4725.svg), we can rewrite the equation as:<br />![](https://cdn.nlark.com/yuque/0/2020/png/1077334/1584602276969-9b4253d2-9cf5-4d26-b49b-9a40f12302c3.png)<br />We are multiplying our matrix ![](https://cdn.nlark.com/yuque/__latex/04a91212015fe8f420d28a465f2aa2ff.svg), with dimensions ![](https://cdn.nlark.com/yuque/__latex/cc76e43aec3c8b808a15bcdd9a474508.svg) (where ![](https://cdn.nlark.com/yuque/__latex/14730a16e7ffd22066fda608764658c0.svg) is the number of activation nodes), by our vector ![](https://cdn.nlark.com/yuque/__latex/50bcce59378396724cdf5b6d5c5bfd76.svg) of height (n+1). This gives us our vector ![](https://cdn.nlark.com/yuque/__latex/394b027de818761aa8b46f03804b5c45.svg) of height ![](https://cdn.nlark.com/yuque/__latex/14730a16e7ffd22066fda608764658c0.svg). Now we can get a vector of the activation nodes for layer j as follows:<br />![](https://cdn.nlark.com/yuque/__latex/4b3f2bd0d8456ff0edb8ad7c068b1fa0.svg)<br />Our function g can be applied element-wise to our vector ![](https://cdn.nlark.com/yuque/__latex/394b027de818761aa8b46f03804b5c45.svg).<br />We can then add a bias unit (equal to 1) to layer j after we have computed ![](https://cdn.nlark.com/yuque/__latex/dc9c83a3beda1f80e29f0eff8490b819.svg).
This will be element ![](https://cdn.nlark.com/yuque/__latex/2b30e503c9aee189d972f3dc23bf7519.svg), and it will be equal to 1. To compute our final hypothesis, let's first compute another z vector:<br />![](https://cdn.nlark.com/yuque/__latex/b7d251ed7828974aba8dad6e8c03a273.svg)<br />We get this final z vector by multiplying the next theta matrix after ![](https://cdn.nlark.com/yuque/__latex/04a91212015fe8f420d28a465f2aa2ff.svg) with the values of all the activation nodes we just got. This last theta matrix ![](https://cdn.nlark.com/yuque/__latex/1e5c33cbe122b6abbe043b9b35abdd74.svg) will have only one row, which is multiplied by one column, so that our result is a single number. We then get our final result with:<br />![](https://cdn.nlark.com/yuque/__latex/82e367a6ae4c0ffbe1dbda0627aa5ae6.svg)<br />Notice that in this last step, between layer j and layer j+1, we are doing exactly the same thing as we did in logistic regression. Adding all these intermediate layers in a neural network allows us to more elegantly produce interesting and more complex non-linear hypotheses.
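The whole vectorized procedure of this section can be written as one short loop: repeatedly compute z for the next layer, apply g element-wise, and prepend a bias unit between layers, with the final step being the logistic-regression-like output. The sketch below is a minimal illustration under these conventions; the function name forward_propagate and the weight values are illustrative, not from the text.

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)), applied element-wise to a vector z."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(Thetas, x):
    """Vectorized forward propagation: repeatedly compute
    z^(j) = Theta^(j-1) a^(j-1) and a^(j) = g(z^(j)), prepending a bias
    unit a_0^(j) = 1 between layers. The last step, g(Theta a), is exactly
    logistic regression applied to the final hidden layer's activations."""
    a = np.concatenate(([1.0], x))      # a^(1) = x, with bias unit x_0 = 1
    for Theta in Thetas[:-1]:           # hidden layers
        a = np.concatenate(([1.0], sigmoid(Theta @ a)))
    return sigmoid(Thetas[-1] @ a)      # output layer: h_Theta(x)

# Illustrative weights for layers of size 3 -> 3 -> 1; by the dimension
# rule s_(j+1) x (s_j + 1), Theta^(1) is 3x4 and Theta^(2) is 1x4.
rng = np.random.default_rng(1)
Thetas = [rng.standard_normal((3, 4)), rng.standard_normal((1, 4))]
h = forward_propagate(Thetas, np.array([0.5, -1.0, 2.0]))  # value in (0, 1)
```

Because the output layer here has a single unit, h holds a single number, matching the text; with more output units the same loop returns a vector.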