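We have seen that probabilistic modeling splits into two camps: the frequentist view, which leads to an optimization problem, and the Bayesian view, which leads to an integration problem. From the Bayesian perspective, inference for a new sample $\hat{x}$ requires: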
$$p(\hat{x}|X)=\int_\theta p(\hat{x},\theta|X)\,d\theta=\int_\theta p(\theta|X)\,p(\hat{x}|\theta,X)\,d\theta$$
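If the new sample is independent of the dataset given $\theta$, so that $p(\hat{x}|\theta,X)=p(\hat{x}|\theta)$, then the prediction is the expectation of $p(\hat{x}|\theta)$ under the parameter posterior $p(\theta|X)$.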
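To make the "prediction as a posterior expectation" reading concrete, here is a minimal sketch under an assumed Beta-Bernoulli model (this model, the Beta(1, 1) prior, and the function names are illustrative choices, not part of these notes). It averages $p(\hat{x}=1|\theta)=\theta$ over samples from the posterior and compares the result with the closed form.

```python
import random


def posterior_predictive_mc(heads, tails, num_samples=20000, seed=0):
    """Estimate p(x_hat = 1 | X) = E_{p(theta|X)}[p(x_hat = 1 | theta)] by sampling."""
    rng = random.Random(seed)
    # Assumed Beta(1, 1) prior; the posterior is Beta(1 + heads, 1 + tails).
    a, b = 1 + heads, 1 + tails
    total = 0.0
    for _ in range(num_samples):
        theta = rng.betavariate(a, b)  # theta ~ p(theta | X)
        total += theta                 # p(x_hat = 1 | theta) = theta
    return total / num_samples


def posterior_predictive_exact(heads, tails):
    """Closed-form posterior mean of theta for the same assumed model."""
    return (1 + heads) / (2 + heads + tails)


mc = posterior_predictive_mc(7, 3)
exact = posterior_predictive_exact(7, 3)
print(mc, exact)
```

In this conjugate toy case the integral is available in closed form, so sampling is unnecessary; the point is only that the same averaging recipe still applies when the closed form is not available.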
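The core of the inference problem is therefore computing the parameter posterior. Inference methods fall into two groups:
- Exact inference
- Approximate inference, used when the posterior over the parameter space cannot be computed exactly
  - Deterministic approximation, e.g. variational inference
  - Stochastic approximation, e.g. MCMC, MH, Gibbs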
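Variational inference under the mean-field assumption
Let $Z$ denote the collection of latent variables and parameters, and let $Z_i$ be its $i$-th component. Recall the derivation used for EM: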
$$\log p(X)=\log p(X,Z)-\log p(Z|X)=\log\frac{p(X,Z)}{q(Z)}-\log\frac{p(Z|X)}{q(Z)}$$
Integrate both sides with respect to $q(Z)$:
$$\begin{aligned}
\text{Left: }&\int_Z q(Z)\log p(X)\,dZ=\log p(X)\\
\text{Right: }&\int_Z\left[\log\frac{p(X,Z)}{q(Z)}-\log\frac{p(Z|X)}{q(Z)}\right]q(Z)\,dZ=ELBO+KL(q,p)
\end{aligned}$$
The right-hand side is thus the sum of a variational functional $L(q)$ (the ELBO) and a KL divergence:
$$L(q)+KL(q,p)$$
This sum equals $\log p(X)$, which does not depend on $q$, so finding a $q(Z)$ close to the posterior $p(Z|X)$ (that is, making the KL term small) is equivalent to maximizing $L(q)$:
$$\hat{q}(Z)=\mathop{argmax}_{q(Z)}L(q)$$
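The decomposition $\log p(X)=L(q)+KL(q,p)$ can be checked numerically. The sketch below assumes a toy discrete latent variable with three states; the joint table `joint` and the trial distribution `q` are made-up numbers used only for illustration.

```python
import math

# Assumed toy values: p(X = x, Z = z) for one fixed observation x and z in {0, 1, 2}.
joint = [0.10, 0.25, 0.05]
# An arbitrary trial distribution q(Z).
q = [0.3, 0.5, 0.2]

evidence = sum(joint)                      # p(X) = sum_z p(X, z)
posterior = [p / evidence for p in joint]  # p(Z | X)

# L(q) = sum_z q(z) log(p(X, z) / q(z))
elbo = sum(qz * math.log(pz / qz) for qz, pz in zip(q, joint))
# KL(q, p) = sum_z q(z) log(q(z) / p(z | X))
kl = sum(qz * math.log(qz / pz) for qz, pz in zip(q, posterior))

print(elbo + kl, math.log(evidence))
```

Because the KL term is non-negative, `elbo` never exceeds `math.log(evidence)`, and the gap closes exactly when `q` equals `posterior`.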
Assume $q(Z)$ factorizes into $M$ groups (the mean-field approximation):
$$q(Z)=\prod\limits_{i=1}^M q_i(Z_i)$$
Then, in $L(q)=\int_Z q(Z)\log p(X,Z)\,dZ-\int_Z q(Z)\log q(Z)\,dZ$, focus on a single factor $q_j(Z_j)$. The first term is:
$$\begin{aligned}
\int_Z q(Z)\log p(X,Z)\,dZ&=\int_Z\prod\limits_{i=1}^M q_i(Z_i)\log p(X,Z)\,dZ\\
&=\int_{Z_j}q_j(Z_j)\int_{Z-Z_j}\prod\limits_{i\ne j}q_i(Z_i)\log p(X,Z)\,dZ\\
&=\int_{Z_j}q_j(Z_j)\,\mathbb{E}_{\prod\limits_{i\ne j}q_i(Z_i)}[\log p(X,Z)]\,dZ_j
\end{aligned}$$
The second term is:
$$\int_Z q(Z)\log q(Z)\,dZ=\int_Z\prod\limits_{i=1}^M q_i(Z_i)\sum\limits_{i=1}^M\log q_i(Z_i)\,dZ$$
Taking the first term of the expanded sum, every factor other than $q_1$ integrates to 1:
$$\int_Z\prod\limits_{i=1}^M q_i(Z_i)\log q_1(Z_1)\,dZ=\int_{Z_1}q_1(Z_1)\log q_1(Z_1)\,dZ_1$$
Therefore:
$$\int_Z q(Z)\log q(Z)\,dZ=\sum\limits_{i=1}^M\int_{Z_i}q_i(Z_i)\log q_i(Z_i)\,dZ_i=\int_{Z_j}q_j(Z_j)\log q_j(Z_j)\,dZ_j+Const$$
Subtracting the two terms and defining $\mathbb{E}_{\prod\limits_{i\ne j}q_i(Z_i)}[\log p(X,Z)]=\log\hat{p}(X,Z_j)$, the part of $L(q)$ that depends on $q_j$ is:
$$-\int_{Z_j}q_j(Z_j)\log\frac{q_j(Z_j)}{\hat{p}(X,Z_j)}\,dZ_j\le 0$$
This expression is a negative KL divergence, so it reaches its maximum only when $q_j(Z_j)=\hat{p}(X,Z_j)$ (up to normalization). For each $q_j$, this optimum is computed with all other $q_i$ held fixed, so the factors can be updated iteratively by coordinate ascent. The derivation above is written for a single sample, but it applies equally to a whole dataset.
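The coordinate ascent loop can be sketched on a case where each update has a closed form. The example below assumes the target distribution is a two-dimensional Gaussian with mean `mu` and precision matrix `lam` (an illustrative stand-in for the posterior, not a model from these notes), and approximates it by $q_1(z_1)q_2(z_2)$. For this target, $\mathbb{E}_{q_2}[\log p]$ is quadratic in $z_1$, so the optimal $q_1$ is Gaussian with variance $1/\Lambda_{11}$ and a mean that depends on the current mean of $q_2$. The function name `cavi_gaussian` is hypothetical.

```python
def cavi_gaussian(mu, lam, num_iters=50):
    """Coordinate ascent for q1(z1) q2(z2) approximating N(mu, inverse(lam))."""
    m1, m2 = 0.0, 0.0  # initial means of the two factors
    for _ in range(num_iters):
        # Update q1 with q2 fixed: log q1(z1) = E_{q2}[log p(z1, z2)] + const
        m1 = mu[0] - lam[0][1] / lam[0][0] * (m2 - mu[1])
        # Update q2 with the freshly updated q1 fixed
        m2 = mu[1] - lam[1][0] / lam[1][1] * (m1 - mu[0])
    # The factor variances are fixed by the diagonal of the precision matrix.
    variances = (1.0 / lam[0][0], 1.0 / lam[1][1])
    return (m1, m2), variances


mu = (1.0, -2.0)
lam = [[2.0, 0.9], [0.9, 1.5]]  # assumed symmetric positive definite precision
means, variances = cavi_gaussian(mu, lam)
print(means, variances)
```

In this toy case the factor means converge to the true mean, while the factor variances come out smaller than the true marginal variances, which illustrates how strong the factorization assumption is.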
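Mean-field variational inference has some drawbacks:
- The assumption is very strong; when $Z$ is highly complex, the factorization does not hold
- The integral inside the expectation may be intractable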
SGVI
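The process from $Z$ to $X$ is called the generative process, or decoding; the reverse process is called inference, or encoding. Mean-field variational inference yields a coordinate ascent algorithm, but its assumption is too strong in some cases and the integrals may not be computable. Besides coordinate ascent, optimization can also be done by gradient ascent, and we would like to use it to derive another algorithm for variational inference.
Our objective is: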
$$\hat{q}(Z)=\mathop{argmax}_{q(Z)}L(q)$$
Assume $q(Z)=q_\phi(Z)$, a distribution parameterized by $\phi$. Then $\mathop{argmax}_{q(Z)}L(q)=\mathop{argmax}_{\phi}L(\phi)$, where $L(\phi)=\mathbb{E}_{q_\phi}[\log p_\theta(x^i,z)-\log q_\phi(z)]$ and $x^i$ denotes the $i$-th sample. Its gradient is:
$$\begin{aligned}
\nabla_\phi L(\phi)&=\nabla_\phi\mathbb{E}_{q_\phi}[\log p_\theta(x^i,z)-\log q_\phi(z)]\\
&=\nabla_\phi\int q_\phi(z)[\log p_\theta(x^i,z)-\log q_\phi(z)]\,dz\\
&=\int\nabla_\phi q_\phi(z)[\log p_\theta(x^i,z)-\log q_\phi(z)]\,dz+\int q_\phi(z)\nabla_\phi[\log p_\theta(x^i,z)-\log q_\phi(z)]\,dz\\
&=\int\nabla_\phi q_\phi(z)[\log p_\theta(x^i,z)-\log q_\phi(z)]\,dz-\int q_\phi(z)\nabla_\phi\log q_\phi(z)\,dz\\
&=\int\nabla_\phi q_\phi(z)[\log p_\theta(x^i,z)-\log q_\phi(z)]\,dz-\int\nabla_\phi q_\phi(z)\,dz\\
&=\int\nabla_\phi q_\phi(z)[\log p_\theta(x^i,z)-\log q_\phi(z)]\,dz\\
&=\int q_\phi(\nabla_\phi\log q_\phi)(\log p_\theta(x^i,z)-\log q_\phi(z))\,dz\\
&=\mathbb{E}_{q_\phi}[(\nabla_\phi\log q_\phi)(\log p_\theta(x^i,z)-\log q_\phi(z))]
\end{aligned}$$
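This expectation can be approximated by Monte Carlo sampling to obtain the gradient, which is then used in gradient ascent to update the parameters: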
$$\begin{aligned}
&z^l\sim q_\phi(z)\\
&\mathbb{E}_{q_\phi}[(\nabla_\phi\log q_\phi)(\log p_\theta(x^i,z)-\log q_\phi(z))]\approx\frac{1}{L}\sum\limits_{l=1}^L(\nabla_\phi\log q_\phi(z^l))(\log p_\theta(x^i,z^l)-\log q_\phi(z^l))
\end{aligned}$$
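A minimal sketch of this Monte Carlo gradient estimator, under an assumed toy model: $p(z)=N(0,1)$, $p(x|z)=N(z,1)$, and $q_\phi(z)=N(\phi,1)$ with $\phi$ the mean. The model and all function names are illustrative assumptions. For this model the exact gradient is $x-2\phi$, which gives something to compare against.

```python
import random


def log_joint(x, z):
    # log p(x, z) for the assumed model, additive constants dropped
    return -0.5 * z * z - 0.5 * (x - z) ** 2


def log_q(z, phi):
    # log q_phi(z) for q_phi = N(phi, 1), additive constants dropped
    return -0.5 * (z - phi) ** 2


def score_function_grad(x, phi, num_samples, rng):
    """Average of (grad_phi log q_phi(z)) * (log p(x, z) - log q_phi(z)) over samples."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(phi, 1.0)  # z^l ~ q_phi(z)
        score = z - phi          # d/dphi log q_phi(z^l) for unit variance
        total += score * (log_joint(x, z) - log_q(z, phi))
    return total / num_samples


rng = random.Random(0)
x, phi = 1.5, 0.2
estimate = score_function_grad(x, phi, 200000, rng)
exact = x - 2.0 * phi  # closed form for this toy model only
print(estimate, exact)
```

Dropping the additive constants keeps the estimator unbiased, because the expectation of $\nabla_\phi\log q_\phi$ under $q_\phi$ is zero. Even in this one-dimensional case a large number of samples is used to get a stable estimate.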
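However, because the summand contains a logarithm term, the variance of this direct estimator is large, and a very large number of samples is needed. To address the high variance, we use the reparameterization trick.
Consider: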
$$\nabla_\phi L(\phi)=\nabla_\phi\mathbb{E}_{q_\phi}[\log p_\theta(x^i,z)-\log q_\phi(z)]$$
Take $z=g_\phi(\varepsilon,x^i)$ with $\varepsilon\sim p(\varepsilon)$. Then for the posterior $z\sim q_\phi(z|x^i)$ we have $|q_\phi(z|x^i)\,dz|=|p(\varepsilon)\,d\varepsilon|$. Substituting into the gradient above:
$$\begin{aligned}
\nabla_\phi L(\phi)&=\nabla_\phi\mathbb{E}_{q_\phi}[\log p_\theta(x^i,z)-\log q_\phi(z)]\\
&=\nabla_\phi\int[\log p_\theta(x^i,z)-\log q_\phi(z)]\,q_\phi\,dz\\
&=\nabla_\phi\int[\log p_\theta(x^i,z)-\log q_\phi(z)]\,p(\varepsilon)\,d\varepsilon\\
&=\mathbb{E}_{p(\varepsilon)}[\nabla_\phi[\log p_\theta(x^i,z)-\log q_\phi(z)]]\\
&=\mathbb{E}_{p(\varepsilon)}[\nabla_z[\log p_\theta(x^i,z)-\log q_\phi(z)]\nabla_\phi z]\\
&=\mathbb{E}_{p(\varepsilon)}[\nabla_z[\log p_\theta(x^i,z)-\log q_\phi(z)]\nabla_\phi g_\phi(\varepsilon,x^i)]
\end{aligned}$$
Applying Monte Carlo sampling to this expression and averaging gives an estimate of the gradient.
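A minimal sketch of the reparameterized estimator, under an assumed toy model: $p(z)=N(0,1)$, $p(x|z)=N(z,1)$, $q_\phi(z)=N(\phi,1)$, with $g_\phi(\varepsilon)=\phi+\varepsilon$ and $\varepsilon\sim N(0,1)$. The model and function names are illustrative assumptions. Each sample evaluates $\nabla_z[\log p_\theta(x,z)-\log q_\phi(z)]\,\nabla_\phi g_\phi(\varepsilon)$, and the exact gradient for this model is $x-2\phi$.

```python
import random


def grad_z_log_joint(x, z):
    # d/dz log p(x, z) for the assumed model p(z) = N(0, 1), p(x | z) = N(z, 1)
    return -z + (x - z)


def grad_z_log_q(z, phi):
    # d/dz log q_phi(z) for q_phi = N(phi, 1)
    return -(z - phi)


def reparam_grad_samples(x, phi, num_samples, rng):
    """Per-sample values of grad_z[log p - log q] * grad_phi g_phi(eps)."""
    samples = []
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)  # eps ~ p(eps) = N(0, 1)
        z = phi + eps              # z = g_phi(eps)
        dz_dphi = 1.0              # grad_phi g_phi(eps) for this g
        samples.append((grad_z_log_joint(x, z) - grad_z_log_q(z, phi)) * dz_dphi)
    return samples


rng = random.Random(0)
x, phi = 1.5, 0.2
samples = reparam_grad_samples(x, phi, 20000, rng)
estimate = sum(samples) / len(samples)
variance = sum((s - estimate) ** 2 for s in samples) / (len(samples) - 1)
exact = x - 2.0 * phi  # closed form for this toy model only
print(estimate, variance, exact)
```

For this model each sample equals $x-2\phi-\varepsilon$, so the per-sample variance is about 1, which is why far fewer samples are needed for a usable gradient estimate.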
