18. Restricted Boltzmann Machines - 《机器学习》 (Machine Learning)


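A Boltzmann machine is an undirected graphical model with hidden nodes. Among graphical models, the simplest is the naive Bayes model (built on the naive Bayes assumption). Introducing a single hidden variable leads to the GMM. Turning that single hidden variable into a sequence of hidden variables gives the state space model (adding the homogeneous Markov assumption and the observation independence assumption yields the HMM, the Kalman Filter, and the Particle Filter). To model dependencies among the observed variables, a maximum entropy model, the MEMM, was introduced. To overcome the locality problem of the MEMM (its per-state local normalization, known as label bias), the CRF was introduced. A CRF is an undirected graph, which breaks the homogeneous Markov assumption; when the hidden variables form a chain, it is called a linear-chain CRF.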

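Introducing hidden variables into an undirected graph gives the Boltzmann machine, whose probability density function belongs to the exponential family. Placing certain restrictions on the connections among the hidden and observed variables gives the Restricted Boltzmann Machine (RBM).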

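As we can see, different probabilistic graphical models make assumptions about the following characteristics:

- Direction: a property of the edges
- Discrete / continuous / mixed: a property of the nodes
- Conditional independence: a property of the edges
- Hidden variables: a property of the nodes
- Exponential family: a property of the structure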

Denote the observed variables and the hidden variables by $v=\{v_1,\cdots,v_n\}$ and $h=\{h_1,\cdots,h_m\}$ respectively. Recall that an undirected graph factorizes over its maximal cliques, so its distribution can be written in the form of a Boltzmann distribution, $p(x)=\frac{1}{Z}\prod\limits_{i=1}^K\psi_i(x_{c_i})=\frac{1}{Z}\exp(-\sum\limits_{i=1}^KE(x_{c_i}))$, which is also an exponential family distribution.

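A general Boltzmann machine has a series of problems: in its inference task, exact inference is intractable, and approximate inference is still too computationally expensive. To address this, a simplified Boltzmann machine, the Restricted Boltzmann Machine, assumes that there are no connections among the hidden variables or among the observed variables; connections exist only between hidden and observed variables. As a result: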

$$p(x)=p(h,v)=\frac{1}{Z}\exp(-E(v,h))$$

Here the energy function $E(v,h)$ can be written as three parts: two terms associated with the two sets of nodes, and one term associated with the edges, whose weights are $w$. We write:

$$E(v,h)=-(h^Twv+\alpha^T v+\beta^T h)$$
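As a small illustration of the energy function above, here is a minimal sketch in plain Python. The parameter values, the sizes ($m=2$ hidden units, $n=3$ visible units) and the function name `energy` are assumptions made up for this example; they do not come from the original notes.

```python
def energy(v, h, w, alpha, beta):
    """E(v, h) = -(h^T w v + alpha^T v + beta^T h), with w of shape m x n."""
    m, n = len(h), len(v)
    # Edge term: one contribution h_i * w_ij * v_j per hidden-visible edge.
    interaction = sum(h[i] * w[i][j] * v[j] for i in range(m) for j in range(n))
    # Node terms: one bias contribution per visible node and per hidden node.
    visible_bias = sum(alpha[j] * v[j] for j in range(n))
    hidden_bias = sum(beta[i] * h[i] for i in range(m))
    return -(interaction + visible_bias + hidden_bias)


# Hypothetical toy parameters (assumption, for illustration only).
w = [[0.5, -1.0, 0.25],
     [1.5, 0.0, -0.5]]
alpha = [0.1, -0.2, 0.3]
beta = [-0.4, 0.2]

v = [1, 0, 1]
h = [1, 1]
e = energy(v, h, w, alpha, beta)
print(e)
```

Lower energy corresponds to higher unnormalized probability $\exp(-E(v,h))$.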

Substituting this energy function into $p(h,v)=\frac{1}{Z}\exp(-E(v,h))$ gives:

$$p(x)=\frac{1}{Z}\exp(h^Twv)\exp(\alpha^T v)\exp(\beta^T h)=\frac{1}{Z}\prod_{i=1}^m\prod_{j=1}^n\exp(h_iw_{ij}v_j)\prod_{j=1}^n\exp(\alpha_jv_j)\prod_{i=1}^m\exp(\beta_ih_i)$$

This expression corresponds one-to-one with the factor graph of the RBM.
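The factorized form can be checked numerically on a tiny model. The sketch below assumes that both visible and hidden units are binary and uses made-up parameters; it enumerates every configuration to compute $Z$ (feasible only for very small $m$ and $n$) and compares the product of edge and node factors with $\exp(-E(v,h))$.

```python
import itertools
import math

# Hypothetical toy parameters (assumption): m = 2 hidden, n = 3 visible units.
w = [[0.5, -1.0, 0.25],
     [1.5, 0.0, -0.5]]
alpha = [0.1, -0.2, 0.3]
beta = [-0.4, 0.2]
m, n = len(beta), len(alpha)


def energy(v, h):
    """E(v, h) = -(h^T w v + alpha^T v + beta^T h)."""
    interaction = sum(h[i] * w[i][j] * v[j] for i in range(m) for j in range(n))
    return -(interaction
             + sum(alpha[j] * v[j] for j in range(n))
             + sum(beta[i] * h[i] for i in range(m)))


def factor_product(v, h):
    """Unnormalized probability as a product of edge and node factors."""
    value = 1.0
    for i in range(m):
        for j in range(n):
            value *= math.exp(h[i] * w[i][j] * v[j])  # one factor per edge
    for j in range(n):
        value *= math.exp(alpha[j] * v[j])            # one factor per visible node
    for i in range(m):
        value *= math.exp(beta[i] * h[i])             # one factor per hidden node
    return value


# All 2^n * 2^m joint configurations of the binary model.
states = [(v, h)
          for v in itertools.product([0, 1], repeat=n)
          for h in itertools.product([0, 1], repeat=m)]

# Partition function by exhaustive enumeration.
Z = sum(math.exp(-energy(v, h)) for v, h in states)
joint = {(v, h): math.exp(-energy(v, h)) / Z for v, h in states}

# Largest difference between the factorized form and exp(-E) over all states.
max_gap = max(abs(factor_product(v, h) - math.exp(-energy(v, h)))
              for v, h in states)
print(len(states), sum(joint.values()), max_gap)
```

The number of configurations grows as $2^{m+n}$, which is why this enumeration is only a sanity check and not a practical way to compute $Z$.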

Inference

The inference task consists of computing the posterior $p(h|v)$ and the marginal $p(v)$.

$p(h|v)$

An undirected graph satisfies the local Markov property, i.e. $p(h_1|h-\{h_1\},v)=p(h_1|Neighbour(h_1))=p(h_1|v)$. We therefore obtain:

$$p(h|v)=\prod_{i=1}^mp(h_i|v)$$
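One way to see why the posterior factorizes is to combine the chain rule with the conditional independence above. Because each $h_i$ is connected only to visible nodes, conditioning on other hidden units in addition to $v$ changes nothing:

$$
\begin{aligned}
p(h|v)&=\prod_{i=1}^mp(h_i|h_1,\cdots,h_{i-1},v) &&\text{(chain rule)}\\
&=\prod_{i=1}^mp(h_i|v) &&\text{(given } v\text{, } h_i \text{ is independent of the other hidden units)}
\end{aligned}
$$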

Consider the Binary RBM, in which every hidden variable takes only two values, $h_i\in\{0,1\}$:

$$p(h_l=1|v)=\frac{p(h_l=1,h_{-l},v)}{p(h_{-l},v)}=\frac{p(h_l=1,h_{-l},v)}{p(h_l=1,h_{-l},v)+p(h_l=0,h_{-l},v)}$$

Split the energy function into the terms that involve $h_l$ and those that do not:

$$E(v,h)=-\left(\sum\limits_{i=1,i\ne l}^m\sum\limits_{j=1}^nh_iw_{ij}v_j+h_l\sum\limits_{j=1}^nw_{lj}v_j+\sum\limits_{j=1}^n\alpha_jv_j+\sum\limits_{i=1,i\ne l}^m\beta_ih_i+\beta_lh_l\right)$$

Define $h_lH_l(v)=h_l\sum\limits_{j=1}^nw_{lj}v_j+\beta_lh_l$ and $\overline{H}(h_{-l},v)=\sum\limits_{i=1,i\ne l}^m\sum\limits_{j=1}^nh_iw_{ij}v_j+\sum\limits_{j=1}^n\alpha_jv_j+\sum\limits_{i=1,i\ne l}^m\beta_ih_i$, so that $E(v,h)=-(h_lH_l(v)+\overline{H}(h_{-l},v))$.

Substituting into the expression for $p(h_l=1|v)$ (the factor $\frac{1}{Z}$ cancels), we have:

$$p(h_l=1|v)=\frac{\exp(H_l(v)+\overline{H}(h_{-l},v))}{\exp(H_l(v)+\overline{H}(h_{-l},v))+\exp(\overline{H}(h_{-l},v))}=\frac{1}{1+\exp(-H_l(v))}=\sigma(H_l(v))$$

This gives the posterior. The posterior over $v$ is symmetric, so it can be derived in the same way.
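The sigmoid form of the posterior can be compared against direct enumeration on a toy model. This is a sketch under stated assumptions: the parameters are made up, the visible units are also taken to be binary, and the formula used for $p(v_k=1|h)$, namely $\sigma(\sum_i h_iw_{ik}+\alpha_k)$, is the symmetric counterpart obtained by swapping the roles of $v$ and $h$ in the derivation above.

```python
import itertools
import math

# Hypothetical toy parameters (assumption): m = 2 hidden, n = 3 visible units.
w = [[0.5, -1.0, 0.25],
     [1.5, 0.0, -0.5]]
alpha = [0.1, -0.2, 0.3]
beta = [-0.4, 0.2]
m, n = len(beta), len(alpha)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def p_hidden_given_visible(v):
    """p(h_l = 1 | v) = sigmoid(sum_j w_lj v_j + beta_l), for every l."""
    return [sigmoid(sum(w[l][j] * v[j] for j in range(n)) + beta[l])
            for l in range(m)]


def p_visible_given_hidden(h):
    """Symmetric case: p(v_k = 1 | h) = sigmoid(sum_i h_i w_ik + alpha_k)."""
    return [sigmoid(sum(h[i] * w[i][k] for i in range(m)) + alpha[k])
            for k in range(n)]


def unnormalized(v, h):
    """exp(-E(v, h)); the constant Z cancels in every conditional."""
    interaction = sum(h[i] * w[i][j] * v[j] for i in range(m) for j in range(n))
    return math.exp(interaction
                    + sum(alpha[j] * v[j] for j in range(n))
                    + sum(beta[i] * h[i] for i in range(m)))


def brute_force_hidden(v, l):
    """p(h_l = 1 | v) by summing over all 2^m hidden configurations."""
    configs = list(itertools.product([0, 1], repeat=m))
    total = sum(unnormalized(v, h) for h in configs)
    return sum(unnormalized(v, h) for h in configs if h[l] == 1) / total


def brute_force_visible(h, k):
    """p(v_k = 1 | h) by summing over all 2^n visible configurations."""
    configs = list(itertools.product([0, 1], repeat=n))
    total = sum(unnormalized(v, h) for v in configs)
    return sum(unnormalized(v, h) for v in configs if v[k] == 1) / total


v = (1, 0, 1)
closed_form = p_hidden_given_visible(v)
enumerated = [brute_force_hidden(v, l) for l in range(m)]
print(closed_form, enumerated)
```

The closed form costs on the order of $mn$ operations for all hidden units, while the enumeration grows exponentially with the number of units.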

$p(v)$

$$
\begin{aligned}
p(v)&=\sum\limits_hp(h,v)=\sum\limits_h\frac{1}{Z}\exp(h^Twv+\alpha^Tv+\beta^Th)\\
&=\exp(\alpha^Tv)\frac{1}{Z}\sum\limits_{h_1}\exp(h_1w_1v+\beta_1h_1)\cdots\sum\limits_{h_m}\exp(h_mw_mv+\beta_mh_m)\\
&=\exp(\alpha^Tv)\frac{1}{Z}(1+\exp(w_1v+\beta_1))\cdots(1+\exp(w_mv+\beta_m))\\
&=\frac{1}{Z}\exp(\alpha^Tv+\sum\limits_{i=1}^m\log(1+\exp(w_iv+\beta_i)))
\end{aligned}
$$

Here $w_i$ denotes the $i$-th row of $w$, and $\log(1+\exp(x))$ is called the Softplus function.
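To make the Softplus form of the marginal concrete, the sketch below evaluates $Z\,p(v)=\exp(\alpha^Tv+\sum_i\log(1+\exp(w_iv+\beta_i)))$ and compares it with an explicit sum over all hidden configurations. The parameters are made up, the visible units are assumed binary so that $Z$ can be obtained by enumerating $v$, and the rewritten `softplus` (a numerically stable form of $\log(1+\exp(x))$) is an implementation choice, not something taken from the notes.

```python
import itertools
import math

# Hypothetical toy parameters (assumption): m = 2 hidden, n = 3 visible units.
w = [[0.5, -1.0, 0.25],
     [1.5, 0.0, -0.5]]
alpha = [0.1, -0.2, 0.3]
beta = [-0.4, 0.2]
m, n = len(beta), len(alpha)


def softplus(x):
    """log(1 + exp(x)), rewritten to avoid overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def unnormalized_marginal(v):
    """Z * p(v) = exp(alpha^T v + sum_i softplus(w_i v + beta_i))."""
    bias_term = sum(alpha[j] * v[j] for j in range(n))
    hidden_term = sum(
        softplus(sum(w[i][j] * v[j] for j in range(n)) + beta[i])
        for i in range(m))
    return math.exp(bias_term + hidden_term)


def enumerated_marginal(v):
    """Z * p(v) by summing exp(-E(v, h)) over all 2^m hidden configurations."""
    total = 0.0
    for h in itertools.product([0, 1], repeat=m):
        interaction = sum(h[i] * w[i][j] * v[j]
                          for i in range(m) for j in range(n))
        total += math.exp(interaction
                          + sum(alpha[j] * v[j] for j in range(n))
                          + sum(beta[i] * h[i] for i in range(m)))
    return total


visible_states = list(itertools.product([0, 1], repeat=n))

# With the hidden units summed out analytically, Z needs only 2^n terms.
Z = sum(unnormalized_marginal(v) for v in visible_states)
p_v = {v: unnormalized_marginal(v) / Z for v in visible_states}
print(sum(p_v.values()))
```

Summing out $h$ analytically removes the $2^m$ factor from the cost, but computing $Z$ still requires a sum over all visible configurations, so the marginal remains expensive to normalize for large $n$.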