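Neural Networks and Deep Learning Notes (Extra): Deriving Backpropagation
Review
Let $n^{[l]}$ denote the number of units in layer $l$.
The quantities involved then have the following dimensions: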
$$w^{[l]},\ dw^{[l]} : (n^{[l]}, n^{[l-1]})$$
$$b^{[l]},\ db^{[l]} : (n^{[l]}, 1)$$
$$z^{[l]},\ a^{[l]} : (n^{[l]}, 1)$$
$$Z^{[l]},\ A^{[l]},\ dZ^{[l]},\ dA^{[l]} : (n^{[l]}, m)$$
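As a quick illustration of these shapes, here is a minimal sketch in plain Python. The layer sizes and the helpers `zeros` and `shape` are hypothetical, chosen only for this example, and $m$ is assumed to be the number of training examples.

```python
# Hypothetical layer sizes, chosen only for illustration.
n_prev, n_l, m = 3, 2, 5   # n^{[l-1]}, n^{[l]}, assumed number of examples


def zeros(rows, cols):
    """Build a rows x cols matrix as a list of lists."""
    return [[0.0] * cols for _ in range(rows)]


def shape(mat):
    """Return (rows, cols) of a list-of-lists matrix."""
    return (len(mat), len(mat[0]))


w = zeros(n_l, n_prev)   # same shape as w^{[l]} and dw^{[l]}
b = zeros(n_l, 1)        # same shape as b^{[l]} and db^{[l]}
z = zeros(n_l, 1)        # same shape as z^{[l]} and a^{[l]} (one example)
Z = zeros(n_l, m)        # same shape as Z^{[l]}, A^{[l]}, dZ^{[l]}, dA^{[l]}

print(shape(w), shape(b), shape(z), shape(Z))
# prints: (2, 3) (2, 1) (2, 1) (2, 5)
```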
The backpropagation formula is:
$$dz^{[l]} = da^{[l]} * g^{[l]'}(z^{[l]})$$
Derivation
First, we know that
$$a^{[l]} = g^{[l]}(z^{[l]})$$
$$\jmath(a, y) = -y\log a - (1-y)\log(1-a)$$
Now we begin the derivation:
Proof of $dz^{[l]} = da^{[l]} * g^{[l]'}(z^{[l]})$
Starting from $\jmath(a, y) = -y\log a - (1-y)\log(1-a)$ and differentiating with respect to $a^{[l]}$ gives:
$$\frac{d\jmath(a^{[l]}, y)}{da^{[l]}} = -\frac{y}{a^{[l]}} + \frac{1-y}{1-a^{[l]}}$$
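The derivative of the loss with respect to $a$ can be sanity-checked numerically. The sketch below compares the analytic expression with a central finite difference; the values of `a`, `y`, and `eps` are arbitrary choices for illustration, not taken from the notes.

```python
import math


def loss(a, y):
    """Cross-entropy loss for a single prediction a and label y."""
    return -y * math.log(a) - (1 - y) * math.log(1 - a)


def dloss_da(a, y):
    """Analytic derivative of the loss with respect to a."""
    return -y / a + (1 - y) / (1 - a)


a, y, eps = 0.7, 1.0, 1e-6   # arbitrary illustrative values

# Central finite difference approximation of d(loss)/da.
numeric = (loss(a + eps, y) - loss(a - eps, y)) / (2 * eps)
analytic = dloss_da(a, y)

print(abs(numeric - analytic) < 1e-6)
```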
By the chain rule:
$$\frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} = \frac{d\jmath(a^{[l]}, y)}{da^{[l]}} * \frac{da^{[l]}}{dz^{[l]}}$$
and
$$\frac{da^{[l]}}{dz^{[l]}} = g^{[l]'}(z^{[l]})$$
Substituting this into the chain-rule expression gives:
$$\frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} = da^{[l]} * g^{[l]'}(z^{[l]})$$
Note: $da^{[l]}$ in the expression above is shorthand. Written out in full, it is:
$$\frac{d\jmath(a^{[l]}, y)}{da^{[l]}}$$
The same shorthand applies to $dz^{[l]}$, $dw^{[l]}$, and $db^{[l]}$ below.
This proves the formula $dz^{[l]} = da^{[l]} * g^{[l]'}(z^{[l]})$.
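As a worked special case, assume the activation $g$ is the sigmoid (an assumption for this example; the notes keep $g$ general). Then $g'(z) = a(1-a)$, and the formula collapses to $dz = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right) a(1-a) = a - y$. A minimal sketch that checks this numerically, with arbitrary values for `z` and `y`:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


z, y = 0.4, 1.0                    # hypothetical pre-activation and label
a = sigmoid(z)                     # a = g(z)
da = -y / a + (1 - y) / (1 - a)    # d(loss)/da from the loss formula
g_prime = a * (1 - a)              # derivative of the sigmoid at z
dz = da * g_prime                  # dz = da * g'(z)

# For the sigmoid, dz should simplify to a - y.
print(abs(dz - (a - y)) < 1e-12)
```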
Proof of $dw^{[l]} = dz^{[l]} * a^{[l-1]T}$
From the result proved above,
$$\frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} = da^{[l]} * g^{[l]'}(z^{[l]})$$
we derive
$$\frac{d\jmath(a^{[l]}, y)}{dw^{[l]}} = \frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} * \frac{dz^{[l]}}{dw^{[l]}}$$
Because:
$$z^{[l]} = w^{[l]} a^{[l-1]} + b^{[l]}$$
we have:
$$\frac{dz^{[l]}}{dw^{[l]}} = a^{[l-1]}$$
Therefore:
$$\frac{d\jmath(a^{[l]}, y)}{dw^{[l]}} = \frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} * \frac{dz^{[l]}}{dw^{[l]}} = dz^{[l]} * a^{[l-1]T}$$
The transpose on $a^{[l-1]}$ is required by the dimensions: $dz^{[l]}$ is $(n^{[l]}, 1)$ and $a^{[l-1]T}$ is $(1, n^{[l-1]})$, so the product has the same shape $(n^{[l]}, n^{[l-1]})$ as $w^{[l]}$.
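Written elementwise, the result says that entry $(i, j)$ of $dw^{[l]}$ is $dz^{[l]}_i$ times $a^{[l-1]}_j$. The sketch below checks this against a finite-difference gradient for one example. It assumes a sigmoid activation and a loss summed over the units of the layer; both are assumptions made for this illustration, and all numeric values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def total_loss(w, b, a_prev, y):
    """Sum of per-unit cross-entropy losses for one example (an assumption)."""
    total = 0.0
    for i in range(len(w)):
        z_i = sum(w[i][j] * a_prev[j] for j in range(len(a_prev))) + b[i]
        a_i = sigmoid(z_i)
        total += -y[i] * math.log(a_i) - (1 - y[i]) * math.log(1 - a_i)
    return total


# Hypothetical values: n^{[l]} = 2 units, n^{[l-1]} = 3 inputs.
w = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.5]]
b = [0.05, -0.1]
a_prev = [1.0, 0.5, -1.5]
y = [1.0, 0.0]

# Analytic gradient: dz = a - y for the sigmoid, then dw[i][j] = dz[i] * a_prev[j].
z = [sum(w[i][j] * a_prev[j] for j in range(3)) + b[i] for i in range(2)]
dz = [sigmoid(z[i]) - y[i] for i in range(2)]
dw = [[dz[i] * a_prev[j] for j in range(3)] for i in range(2)]

# Numerical gradient by central differences, one weight at a time.
eps = 1e-6
max_err = 0.0
for i in range(2):
    for j in range(3):
        w[i][j] += eps
        up = total_loss(w, b, a_prev, y)
        w[i][j] -= 2 * eps
        down = total_loss(w, b, a_prev, y)
        w[i][j] += eps   # restore the original weight
        max_err = max(max_err, abs((up - down) / (2 * eps) - dw[i][j]))

print(max_err < 1e-6)
```

Note that `dw` comes out as a 2 x 3 nested list, the same shape as `w`.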
Proof of $db^{[l]} = dz^{[l]}$
From the result proved above,
$$\frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} = da^{[l]} * g^{[l]'}(z^{[l]})$$
we derive
$$\frac{d\jmath(a^{[l]}, y)}{db^{[l]}} = \frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} * \frac{dz^{[l]}}{db^{[l]}}$$
Because:
$$z^{[l]} = w^{[l]} a^{[l-1]} + b^{[l]}$$
we have:
$$\frac{dz^{[l]}}{db^{[l]}} = 1$$
Therefore:
$$\frac{d\jmath(a^{[l]}, y)}{db^{[l]}} = \frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} * \frac{dz^{[l]}}{db^{[l]}} = dz^{[l]}$$
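The bias gradient can be checked the same way for a single unit. This is a minimal sketch assuming a sigmoid activation (so that $dz = a - y$); the scalar values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, a_prev, y):
    """Cross-entropy loss of a single sigmoid unit with scalar input."""
    a = sigmoid(w * a_prev + b)
    return -y * math.log(a) - (1 - y) * math.log(1 - a)


w, b, a_prev, y = 0.8, -0.3, 1.5, 0.0   # hypothetical scalar values

dz = sigmoid(w * a_prev + b) - y   # dz = a - y for the sigmoid
db = dz                            # the result derived above: db = dz

# Central finite difference with respect to the bias.
eps = 1e-6
numeric = (loss(w, b + eps, a_prev, y) - loss(w, b - eps, a_prev, y)) / (2 * eps)

print(abs(numeric - db) < 1e-6)
```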
Proof of $da^{[l-1]} = w^{[l]T} * dz^{[l]}$
From the result proved above,
$$\frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} = da^{[l]} * g^{[l]'}(z^{[l]})$$
we derive
$$\frac{d\jmath(a^{[l]}, y)}{da^{[l-1]}} = \frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} * \frac{dz^{[l]}}{da^{[l-1]}}$$
Because:
$$z^{[l]} = w^{[l]} a^{[l-1]} + b^{[l]}$$
we have:
$$\frac{dz^{[l]}}{da^{[l-1]}} = w^{[l]}$$
Therefore:
$$\frac{d\jmath(a^{[l]}, y)}{da^{[l-1]}} = \frac{d\jmath(a^{[l]}, y)}{dz^{[l]}} * \frac{dz^{[l]}}{da^{[l-1]}} = w^{[l]T} * dz^{[l]}$$
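Elementwise, this result says that entry $j$ of $da^{[l-1]}$ is the sum over units $i$ of $w^{[l]}_{ij}$ times $dz^{[l]}_i$. The sketch below compares that with a finite-difference gradient with respect to the previous layer's activations. As before, the sigmoid activation, the loss summed over units, and all numeric values are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def total_loss(w, b, a_prev, y):
    """Sum of per-unit cross-entropy losses for one example (an assumption)."""
    total = 0.0
    for w_i, b_i, y_i in zip(w, b, y):
        a_i = sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(w_i, a_prev)) + b_i)
        total += -y_i * math.log(a_i) - (1 - y_i) * math.log(1 - a_i)
    return total


# Hypothetical values: n^{[l]} = 2 units, n^{[l-1]} = 3 inputs.
w = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.5]]
b = [0.05, -0.1]
a_prev = [1.0, 0.5, -1.5]
y = [1.0, 0.0]

# dz = a - y for the sigmoid.
dz = [sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(w_i, a_prev)) + b_i) - y_i
      for w_i, b_i, y_i in zip(w, b, y)]

# da_prev[j] = sum_i w[i][j] * dz[i], i.e. row j of w^T times dz.
da_prev = [sum(w[i][j] * dz[i] for i in range(len(w)))
           for j in range(len(a_prev))]

# Numerical gradient by central differences, one input at a time.
eps = 1e-6
max_err = 0.0
for j in range(len(a_prev)):
    plus = a_prev[:j] + [a_prev[j] + eps] + a_prev[j + 1:]
    minus = a_prev[:j] + [a_prev[j] - eps] + a_prev[j + 1:]
    numeric = (total_loss(w, b, plus, y) - total_loss(w, b, minus, y)) / (2 * eps)
    max_err = max(max_err, abs(numeric - da_prev[j]))

print(max_err < 1e-6)
```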
Does this look strange?
Why is $w^{[l]}$ transposed?
Recall the dimensions:
$$w^{[l]},\ dw^{[l]} : (n^{[l]}, n^{[l-1]})$$
$$b^{[l]},\ db^{[l]} : (n^{[l]}, 1)$$
$$z^{[l]},\ a^{[l]} : (n^{[l]}, 1)$$
The reason is that the dimensions have to match:
$w^{[l]}$ has dimension $(n^{[l]}, n^{[l-1]})$,
$dz^{[l]}$ has dimension $(n^{[l]}, 1)$,
$da^{[l-1]}$ has dimension $(n^{[l-1]}, 1)$.
So for the product of $w^{[l]}$ and $dz^{[l]}$ to equal $da^{[l-1]}$, we need to transpose $w^{[l]}$. The transposed $w^{[l]T}$ has dimension $(n^{[l-1]}, n^{[l]})$, so $w^{[l]T} * dz^{[l]}$ has dimension $(n^{[l-1]}, 1)$, which makes the equation hold with consistent dimensions.
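The dimension argument can be made concrete with a small sketch. The helpers `transpose`, `matmul`, and `shape` are hypothetical, written only for this example, and the sizes ($n^{[l]} = 2$, $n^{[l-1]} = 3$) and values are arbitrary.

```python
def transpose(mat):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*mat)]


def shape(mat):
    """Return (rows, cols) of a list-of-lists matrix."""
    return (len(mat), len(mat[0]))


def matmul(p, q):
    """Matrix product; requires the inner dimensions of p and q to agree."""
    if shape(p)[1] != shape(q)[0]:
        raise ValueError("inner dimensions do not match")
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


w = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.5]]   # shape (n^{[l]}, n^{[l-1]}) = (2, 3)
dz = [[0.5], [-1.0]]                       # shape (n^{[l]}, 1) = (2, 1)

# Without the transpose the inner dimensions (3 and 2) do not agree.
compatible_without_transpose = shape(w)[1] == shape(dz)[0]

w_t = transpose(w)          # shape (n^{[l-1]}, n^{[l]}) = (3, 2)
da_prev = matmul(w_t, dz)   # shape (n^{[l-1]}, 1) = (3, 1)

print(compatible_without_transpose, shape(w_t), shape(da_prev))
# prints: False (3, 2) (3, 1)
```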