为了解决高斯模型的单峰性的问题,我们引入多个高斯模型的加权平均来拟合多峰数据:
%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Calpha_k%5Cmathcal%7BN%7D(%5Cmu_k%2C%5CSigma_k)%0A#card=math&code=p%28x%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Calpha_k%5Cmathcal%7BN%7D%28%5Cmu_k%2C%5CSigma_k%29%0A#crop=0&crop=0&crop=1&crop=1&id=GMSDA&originHeight=66&originWidth=218&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
引入隐变量 ,这个变量表示对应的样本
属于哪一个高斯分布,这个变量是一个离散的随机变量:
%3Dpi%2C%5Csum%5Climits%7Bi%3D1%7D%5Ekp(z%3Di)%3D1%0A#card=math&code=p%28z%3Di%29%3Dpi%2C%5Csum%5Climits%7Bi%3D1%7D%5Ekp%28z%3Di%29%3D1%0A#crop=0&crop=0&crop=1&crop=1&id=S8D10&originHeight=66&originWidth=271&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

作为一个生成式模型,高斯混合模型通过隐变量 的分布来生成样本。用概率图来表示:
其中,节点 就是上面的概率,
就是生成的高斯分布。于是对
#card=math&code=p%28x%29#crop=0&crop=0&crop=1&crop=1&id=UaXnR&originHeight=26&originWidth=40&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):
%3D%5Csum%5Climitszp(x%2Cz)%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp(x%2Cz%3Dk)%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp(z%3Dk)p(x%7Cz%3Dk)%0A#card=math&code=p%28x%29%3D%5Csum%5Climits_zp%28x%2Cz%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp%28x%2Cz%3Dk%29%3D%5Csum%5Climits_%7Bk%3D1%7D%5EKp%28z%3Dk%29p%28x%7Cz%3Dk%29%0A#crop=0&crop=0&crop=1&crop=1&id=ftfse&originHeight=66&originWidth=547&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
因此:
为了解决高斯模型的单峰性的问题,我们引入多个高斯模型的加权平均来拟合多峰数据:
%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Calpha_k%5Cmathcal%7BN%7D(%5Cmu_k%2C%5CSigma_k)%0A#card=math&code=p%28x%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Calpha_k%5Cmathcal%7BN%7D%28%5Cmu_k%2C%5CSigma_k%29%0A#crop=0&crop=0&crop=1&crop=1&id=R8K0W&originHeight=66&originWidth=218&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
引入隐变量 ,这个变量表示对应的样本
属于哪一个高斯分布,这个变量是一个离散的随机变量:
%3Dpi%2C%5Csum%5Climits%7Bi%3D1%7D%5Ekp(z%3Di)%3D1%0A#card=math&code=p%28z%3Di%29%3Dpi%2C%5Csum%5Climits%7Bi%3D1%7D%5Ekp%28z%3Di%29%3D1%0A#crop=0&crop=0&crop=1&crop=1&id=gvUay&originHeight=66&originWidth=271&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
作为一个生成式模型,高斯混合模型通过隐变量 的分布来生成样本。用概率图来表示:
其中,节点 就是上面的概率,
就是生成的高斯分布。于是对
#card=math&code=p%28x%29#crop=0&crop=0&crop=1&crop=1&id=jQCGN&originHeight=26&originWidth=40&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):
%3D%5Csum%5Climitszp(x%2Cz)%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp(x%2Cz%3Dk)%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp(z%3Dk)p(x%7Cz%3Dk)%0A#card=math&code=p%28x%29%3D%5Csum%5Climits_zp%28x%2Cz%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp%28x%2Cz%3Dk%29%3D%5Csum%5Climits_%7Bk%3D1%7D%5EKp%28z%3Dk%29p%28x%7Cz%3Dk%29%0A#crop=0&crop=0&crop=1&crop=1&id=JIkTq&originHeight=66&originWidth=547&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
因此:
%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%5Cmathcal%7BN%7D(x%7C%5Cmu_k%2C%5CSigma_k)%0A#card=math&code=p%28x%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%5Cmathcal%7BN%7D%28x%7C%5Cmu_k%2C%5CSigma_k%29%0A#crop=0&crop=0&crop=1&crop=1&id=YG8mJ&originHeight=66&originWidth=233&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
极大似然估计
样本为 #card=math&code=X%3D%28x_1%2Cx_2%2C%5Ccdots%2Cx_N%29#crop=0&crop=0&crop=1&crop=1&id=O3jEL&originHeight=26&originWidth=188&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),
#card=math&code=%C2%A0%28X%2CZ%29#crop=0&crop=0&crop=1&crop=1&id=oMBL8&originHeight=26&originWidth=59&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=) 为完全参数,参数为
。我们通过极大似然估计得到
的值:
%3D%5Cmathop%7Bargmax%7D%7B%5Ctheta%7D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p(xi)%5Cnonumber%5C%5C%0A%26%3D%5Cmathop%7Bargmax%7D%5Ctheta%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20%5Csum%5Climits%7Bk%3D1%7D%5EKpk%5Cmathcal%7BN%7D(x_i%7C%5Cmu_k%2C%5CSigma_k)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%5Ctheta%7BMLE%7D%26%3D%5Cmathop%7Bargmax%7D%5Climits%7B%5Ctheta%7D%5Clog%20p%28X%29%3D%5Cmathop%7Bargmax%7D%7B%5Ctheta%7D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p%28x_i%29%5Cnonumber%5C%5C%0A%26%3D%5Cmathop%7Bargmax%7D%5Ctheta%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%2C%5CSigma_k%29%0A%5Cend%7Balign%7D%0A#crop=0&crop=0&crop=1&crop=1&id=VsHch&originHeight=134&originWidth=447&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
这个表达式直接通过求导,由于连加号的存在,无法得到解析解。因此需要使用 EM 算法。
EM 求解 GMM
EM 算法的基本表达式为:%5D#card=math&code=%5Ctheta%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%5Climits%7B%5Ctheta%7D%5Cmathbb%7BE%7D%7Bz%7Cx%2C%5Ctheta_t%7D%5Bp%28x%2Cz%7C%5Ctheta%29%5D#crop=0&crop=0&crop=1&crop=1&id=ZEmri&originHeight=44&originWidth=277&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)。套用 GMM 的表达式,对数据集来说:
%26%3D%5Csum%5Climitsz%5B%5Clog%5Cprod%5Climits%7Bi%3D1%7D%5ENp(xi%2Cz_i%7C%5Ctheta)%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits_z%5B%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p(xi%2Cz_i%7C%5Ctheta)%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7DQ%28%5Ctheta%2C%5Ctheta%5Et%29%26%3D%5Csum%5Climits_z%5B%5Clog%5Cprod%5Climits%7Bi%3D1%7D%5ENp%28xi%2Cz_i%7C%5Ctheta%29%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits_z%5B%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p%28xi%2Cz_i%7C%5Ctheta%29%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp%28z_i%7Cx_i%2C%5Ctheta%5Et%29%0A%5Cend%7Balign%7D%0A#crop=0&crop=0&crop=1&crop=1&id=uTquo&originHeight=134&originWidth=426&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
对于中间的那个求和号,展开,第一项为:
%5Cprod%5Climits%7Bi%3D1%7D%5ENp(z_i%7Cx_i%2C%5Ctheta%5Et)%26%3D%5Csum%5Climits_z%5Clog%20p(x_1%2Cz_1%7C%5Ctheta)p(z_1%7Cx_1%2C%5Ctheta%5Et)%5Cprod%5Climits%7Bi%3D2%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz1%7D%5Clog%20p(x_1%2Cz_1%7C%5Ctheta)%0Ap(z_1%7Cx_1%2C%5Ctheta%5Et)%5Csum%5Climits%7Bz2%2C%5Ccdots%2Cz_K%7D%5Cprod%5Climits%7Bi%3D2%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz1%7D%5Clog%20p(x_1%2Cz_1%7C%5Ctheta)p(z_1%7Cx_1%2C%5Ctheta%5Et)%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%0A%5Csum%5Climits_z%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29%5Cprod%5Climits%7Bi%3D1%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%26%3D%5Csum%5Climits_z%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29p%28z_1%7Cx_1%2C%5Ctheta%5Et%29%5Cprod%5Climits%7Bi%3D2%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz1%7D%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29%0Ap%28z_1%7Cx_1%2C%5Ctheta%5Et%29%5Csum%5Climits%7Bz2%2C%5Ccdots%2Cz_K%7D%5Cprod%5Climits%7Bi%3D2%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz_1%7D%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29p%28z_1%7Cx_1%2C%5Ctheta%5Et%29%5Cend%7Balign%7D%0A#crop=0&crop=0&crop=1&crop=1&id=d1SPo&originHeight=188&originWidth=768&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
类似地, 可以写为:
%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Csum%5Climits%7Bzi%7D%5Clog%20p(x_i%2Cz_i%7C%5Ctheta)p(z_i%7Cx_i%2C%5Ctheta%5Et)%0A#card=math&code=Q%28%5Ctheta%2C%5Ctheta%5Et%29%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Csum%5Climits_%7Bz_i%7D%5Clog%20p%28x_i%2Cz_i%7C%5Ctheta%29p%28z_i%7Cx_i%2C%5Ctheta%5Et%29%0A#crop=0&crop=0&crop=1&crop=1&id=tm55h&originHeight=69&originWidth=378&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
对于 #card=math&code=p%28x%2Cz%7C%5Ctheta%29#crop=0&crop=0&crop=1&crop=1&id=Kf0qz&originHeight=26&originWidth=74&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):
%3Dp(z%7C%5Ctheta)p(x%7Cz%2C%5Ctheta)%3Dp_z%5Cmathcal%7BN%7D(x%7C%5Cmu_z%2C%5CSigma_z)%0A#card=math&code=p%28x%2Cz%7C%5Ctheta%29%3Dp%28z%7C%5Ctheta%29p%28x%7Cz%2C%5Ctheta%29%3Dp_z%5Cmathcal%7BN%7D%28x%7C%5Cmu_z%2C%5CSigma_z%29%0A#crop=0&crop=0&crop=1&crop=1&id=w0N3I&originHeight=27&originWidth=385&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
对 #card=math&code=p%28z%7Cx%2C%5Ctheta%5Et%29#crop=0&crop=0&crop=1&crop=1&id=WvpQf&originHeight=27&originWidth=82&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):
%3D%5Cfrac%7Bp(x%2Cz%7C%5Ctheta%5Et)%7D%7Bp(x%7C%5Ctheta%5Et)%7D%3D%5Cfrac%7Bp_z%5Et%5Cmathcal%7BN%7D(x%7C%5Cmu_z%5Et%2C%5CSigma_z%5Et)%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D(x%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et)%7D%0A#card=math&code=p%28z%7Cx%2C%5Ctheta%5Et%29%3D%5Cfrac%7Bp%28x%2Cz%7C%5Ctheta%5Et%29%7D%7Bp%28x%7C%5Ctheta%5Et%29%7D%3D%5Cfrac%7Bp_z%5Et%5Cmathcal%7BN%7D%28x%7C%5Cmu_z%5Et%2C%5CSigma_z%5Et%29%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D%28x%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et%29%7D%0A#crop=0&crop=0&crop=1&crop=1&id=FkC5j&originHeight=75&originWidth=391&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
代入 :
%7D%5Cfrac%7Bp%7Bz_i%7D%5Et%5Cmathcal%7BN%7D(x_i%7C%5Cmu%7Bzi%7D%5Et%2C%5CSigma%7Bzi%7D%5Et)%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D(x_i%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et)%7D%0A#card=math&code=Q%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Csum%5Climits%7Bz_i%7D%5Clog%20p%7Bzi%7D%5Cmathcal%7BN%28x_i%7C%5Cmu%7Bzi%7D%2C%5CSigma%7Bzi%7D%29%7D%5Cfrac%7Bp%7Bzi%7D%5Et%5Cmathcal%7BN%7D%28x_i%7C%5Cmu%7Bzi%7D%5Et%2C%5CSigma%7Bz_i%7D%5Et%29%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et%29%7D%0A#crop=0&crop=0&crop=1&crop=1&id=KR3Af&originHeight=80&originWidth=467&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
下面需要对 值求最大值:
%5Dp(zi%3Dk%7Cx_i%2C%5Ctheta%5Et)%0A#card=math&code=Q%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits_%7Bi%3D1%7D%5EN%5B%5Clog%20p_k%2B%5Clog%20%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%2C%5CSigma_k%29%5Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%0A#crop=0&crop=0&crop=1&crop=1&id=cWJEL&originHeight=66&originWidth=479&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
:
%5Dp(zi%3Dk%7Cx_i%2C%5Ctheta%5Et)%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKpk%3D1%0A#card=math&code=p_k%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%7Bpk%7D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5B%5Clog%20p_k%2B%5Clog%20%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%2C%5CSigma_k%29%5Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%3D1%0A#crop=0&crop=0&crop=1&crop=1&height=60&id=OFyNF&originHeight=66&originWidth=718&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=&width=648)
即:%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%3D1%0A#card=math&code=p_k%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%7Bpk%7D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p_kp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%3D1%0A#crop=0&crop=0&crop=1&crop=1&id=EeU85&originHeight=66&originWidth=535&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
引入 Lagrange 乘子:%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20pkp(z_i%3Dk%7Cx_i%2C%5Ctheta%5Et)-%5Clambda(1-%5Csum%5Climits%7Bk%3D1%7D%5EKpk)#card=math&code=L%28p_k%2C%5Clambda%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p_kp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29-%5Clambda%281-%5Csum%5Climits%7Bk%3D1%7D%5EKpk%29#crop=0&crop=0&crop=1&crop=1&id=HmJsL&originHeight=66&originWidth=496&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)。所以:
%2B%5Clambda%3D0%5C%5C%0A%5CRightarrow%20%5Csum%5Climits_k%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cfrac%7B1%7D%7Bpk%7Dp(z_i%3Dk%7Cx_i%2C%5Ctheta%5Et)%2B%5Clambda%5Csum%5Climits_kp_k%3D0%5C%5C%0A%5CRightarrow%5Clambda%3D-N%0A#card=math&code=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20p_k%7DL%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cfrac%7B1%7D%7Bpk%7Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%2B%5Clambda%3D0%5C%5C%0A%5CRightarrow%20%5Csum%5Climits_k%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cfrac%7B1%7D%7Bp_k%7Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%2B%5Clambda%5Csum%5Climits_kp_k%3D0%5C%5C%0A%5CRightarrow%5Clambda%3D-N%0A#crop=0&crop=0&crop=1&crop=1&id=Xfvjn&originHeight=162&originWidth=900&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
于是有:
%0A#card=math&code=pk%5E%7Bt%2B1%7D%3D%5Cfrac%7B1%7D%7BN%7D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%0A#crop=0&crop=0&crop=1&crop=1&id=EZTeq&originHeight=66&originWidth=263&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
,这两个参数是无约束的,直接求导即可。
极大似然估计
样本为 #card=math&code=X%3D%28x_1%2Cx_2%2C%5Ccdots%2Cx_N%29#crop=0&crop=0&crop=1&crop=1&id=EwUiK&originHeight=26&originWidth=188&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),
#card=math&code=%C2%A0%28X%2CZ%29#crop=0&crop=0&crop=1&crop=1&id=ANKyx&originHeight=26&originWidth=59&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=) 为完全参数,参数为
。我们通过极大似然估计得到
的值:
%3D%5Cmathop%7Bargmax%7D%7B%5Ctheta%7D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p(xi)%5Cnonumber%5C%5C%0A%26%3D%5Cmathop%7Bargmax%7D%5Ctheta%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20%5Csum%5Climits%7Bk%3D1%7D%5EKpk%5Cmathcal%7BN%7D(x_i%7C%5Cmu_k%2C%5CSigma_k)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%5Ctheta%7BMLE%7D%26%3D%5Cmathop%7Bargmax%7D%5Climits%7B%5Ctheta%7D%5Clog%20p%28X%29%3D%5Cmathop%7Bargmax%7D%7B%5Ctheta%7D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p%28x_i%29%5Cnonumber%5C%5C%0A%26%3D%5Cmathop%7Bargmax%7D%5Ctheta%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%2C%5CSigma_k%29%0A%5Cend%7Balign%7D%0A#crop=0&crop=0&crop=1&crop=1&id=T5WjB&originHeight=134&originWidth=447&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
这个表达式直接通过求导,由于连加号的存在,无法得到解析解。因此需要使用 EM 算法。
EM 求解 GMM
EM 算法的基本表达式为:%5D#card=math&code=%5Ctheta%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%5Climits%7B%5Ctheta%7D%5Cmathbb%7BE%7D%7Bz%7Cx%2C%5Ctheta_t%7D%5Bp%28x%2Cz%7C%5Ctheta%29%5D#crop=0&crop=0&crop=1&crop=1&id=WnbqU&originHeight=44&originWidth=277&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)。套用 GMM 的表达式,对数据集来说:
%26%3D%5Csum%5Climitsz%5B%5Clog%5Cprod%5Climits%7Bi%3D1%7D%5ENp(xi%2Cz_i%7C%5Ctheta)%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits_z%5B%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p(xi%2Cz_i%7C%5Ctheta)%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7DQ%28%5Ctheta%2C%5Ctheta%5Et%29%26%3D%5Csum%5Climits_z%5B%5Clog%5Cprod%5Climits%7Bi%3D1%7D%5ENp%28xi%2Cz_i%7C%5Ctheta%29%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits_z%5B%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p%28xi%2Cz_i%7C%5Ctheta%29%5D%5Cprod%20%5Climits%7Bi%3D1%7D%5ENp%28z_i%7Cx_i%2C%5Ctheta%5Et%29%0A%5Cend%7Balign%7D%0A#crop=0&crop=0&crop=1&crop=1&id=whtwT&originHeight=134&originWidth=426&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
对于中间的那个求和号,展开,第一项为:
%5Cprod%5Climits%7Bi%3D1%7D%5ENp(z_i%7Cx_i%2C%5Ctheta%5Et)%26%3D%5Csum%5Climits_z%5Clog%20p(x_1%2Cz_1%7C%5Ctheta)p(z_1%7Cx_1%2C%5Ctheta%5Et)%5Cprod%5Climits%7Bi%3D2%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz1%7D%5Clog%20p(x_1%2Cz_1%7C%5Ctheta)%0Ap(z_1%7Cx_1%2C%5Ctheta%5Et)%5Csum%5Climits%7Bz2%2C%5Ccdots%2Cz_K%7D%5Cprod%5Climits%7Bi%3D2%7D%5ENp(zi%7Cx_i%2C%5Ctheta%5Et)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz1%7D%5Clog%20p(x_1%2Cz_1%7C%5Ctheta)p(z_1%7Cx_1%2C%5Ctheta%5Et)%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%0A%5Csum%5Climits_z%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29%5Cprod%5Climits%7Bi%3D1%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%26%3D%5Csum%5Climits_z%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29p%28z_1%7Cx_1%2C%5Ctheta%5Et%29%5Cprod%5Climits%7Bi%3D2%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz1%7D%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29%0Ap%28z_1%7Cx_1%2C%5Ctheta%5Et%29%5Csum%5Climits%7Bz2%2C%5Ccdots%2Cz_K%7D%5Cprod%5Climits%7Bi%3D2%7D%5ENp%28zi%7Cx_i%2C%5Ctheta%5Et%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bz_1%7D%5Clog%20p%28x_1%2Cz_1%7C%5Ctheta%29p%28z_1%7Cx_1%2C%5Ctheta%5Et%29%5Cend%7Balign%7D%0A#crop=0&crop=0&crop=1&crop=1&id=Jruda&originHeight=188&originWidth=768&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
类似地, 可以写为:
%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Csum%5Climits%7Bzi%7D%5Clog%20p(x_i%2Cz_i%7C%5Ctheta)p(z_i%7Cx_i%2C%5Ctheta%5Et)%0A#card=math&code=Q%28%5Ctheta%2C%5Ctheta%5Et%29%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Csum%5Climits_%7Bz_i%7D%5Clog%20p%28x_i%2Cz_i%7C%5Ctheta%29p%28z_i%7Cx_i%2C%5Ctheta%5Et%29%0A#crop=0&crop=0&crop=1&crop=1&id=i97fR&originHeight=69&originWidth=378&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
对于 #card=math&code=p%28x%2Cz%7C%5Ctheta%29#crop=0&crop=0&crop=1&crop=1&id=hq0Sc&originHeight=26&originWidth=74&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):
%3Dp(z%7C%5Ctheta)p(x%7Cz%2C%5Ctheta)%3Dp_z%5Cmathcal%7BN%7D(x%7C%5Cmu_z%2C%5CSigma_z)%0A#card=math&code=p%28x%2Cz%7C%5Ctheta%29%3Dp%28z%7C%5Ctheta%29p%28x%7Cz%2C%5Ctheta%29%3Dp_z%5Cmathcal%7BN%7D%28x%7C%5Cmu_z%2C%5CSigma_z%29%0A#crop=0&crop=0&crop=1&crop=1&id=GA6nV&originHeight=27&originWidth=385&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
对 #card=math&code=p%28z%7Cx%2C%5Ctheta%5Et%29#crop=0&crop=0&crop=1&crop=1&id=UbJgf&originHeight=27&originWidth=82&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):
%3D%5Cfrac%7Bp(x%2Cz%7C%5Ctheta%5Et)%7D%7Bp(x%7C%5Ctheta%5Et)%7D%3D%5Cfrac%7Bp_z%5Et%5Cmathcal%7BN%7D(x%7C%5Cmu_z%5Et%2C%5CSigma_z%5Et)%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D(x%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et)%7D%0A#card=math&code=p%28z%7Cx%2C%5Ctheta%5Et%29%3D%5Cfrac%7Bp%28x%2Cz%7C%5Ctheta%5Et%29%7D%7Bp%28x%7C%5Ctheta%5Et%29%7D%3D%5Cfrac%7Bp_z%5Et%5Cmathcal%7BN%7D%28x%7C%5Cmu_z%5Et%2C%5CSigma_z%5Et%29%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D%28x%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et%29%7D%0A#crop=0&crop=0&crop=1&crop=1&id=tW4yP&originHeight=75&originWidth=391&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
代入 :
%7D%5Cfrac%7Bp%7Bz_i%7D%5Et%5Cmathcal%7BN%7D(x_i%7C%5Cmu%7Bzi%7D%5Et%2C%5CSigma%7Bzi%7D%5Et)%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D(x_i%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et)%7D%0A#card=math&code=Q%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Csum%5Climits%7Bz_i%7D%5Clog%20p%7Bzi%7D%5Cmathcal%7BN%28x_i%7C%5Cmu%7Bzi%7D%2C%5CSigma%7Bzi%7D%29%7D%5Cfrac%7Bp%7Bzi%7D%5Et%5Cmathcal%7BN%7D%28x_i%7C%5Cmu%7Bzi%7D%5Et%2C%5CSigma%7Bz_i%7D%5Et%29%7D%7B%5Csum%5Climits_kp_k%5Et%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%5Et%2C%5CSigma_k%5Et%29%7D%0A#crop=0&crop=0&crop=1&crop=1&id=uwBPI&originHeight=80&originWidth=467&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
下面需要对 值求最大值:
%5Dp(zi%3Dk%7Cx_i%2C%5Ctheta%5Et)%0A#card=math&code=Q%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits_%7Bi%3D1%7D%5EN%5B%5Clog%20p_k%2B%5Clog%20%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%2C%5CSigma_k%29%5Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%0A#crop=0&crop=0&crop=1&crop=1&id=PjVt5&originHeight=66&originWidth=479&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
:
%5Dp(zi%3Dk%7Cx_i%2C%5Ctheta%5Et)%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKpk%3D1%0A#card=math&code=p_k%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%7Bpk%7D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5B%5Clog%20p_k%2B%5Clog%20%5Cmathcal%7BN%7D%28x_i%7C%5Cmu_k%2C%5CSigma_k%29%5Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%3D1%0A#crop=0&crop=0&crop=1&crop=1&height=55&id=Fgh89&originHeight=66&originWidth=718&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=&width=596)
即:%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%3D1%0A#card=math&code=p_k%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%7Bpk%7D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p_kp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%5C%20s.t.%5C%20%5Csum%5Climits%7Bk%3D1%7D%5EKp_k%3D1%0A#crop=0&crop=0&crop=1&crop=1&id=klTv7&originHeight=66&originWidth=535&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
引入 Lagrange 乘子:%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20pkp(z_i%3Dk%7Cx_i%2C%5Ctheta%5Et)-%5Clambda(1-%5Csum%5Climits%7Bk%3D1%7D%5EKpk)#card=math&code=L%28p_k%2C%5Clambda%29%3D%5Csum%5Climits%7Bk%3D1%7D%5EK%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20p_kp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29-%5Clambda%281-%5Csum%5Climits%7Bk%3D1%7D%5EKpk%29#crop=0&crop=0&crop=1&crop=1&id=mFqSM&originHeight=66&originWidth=496&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)。所以:
%2B%5Clambda%3D0%5C%5C%0A%5CRightarrow%20%5Csum%5Climits_k%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cfrac%7B1%7D%7Bpk%7Dp(z_i%3Dk%7Cx_i%2C%5Ctheta%5Et)%2B%5Clambda%5Csum%5Climits_kp_k%3D0%5C%5C%0A%5CRightarrow%5Clambda%3D-N%0A#card=math&code=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20p_k%7DL%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cfrac%7B1%7D%7Bpk%7Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%2B%5Clambda%3D0%5C%5C%0A%5CRightarrow%20%5Csum%5Climits_k%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cfrac%7B1%7D%7Bp_k%7Dp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%2B%5Clambda%5Csum%5Climits_kp_k%3D0%5C%5C%0A%5CRightarrow%5Clambda%3D-N%0A#crop=0&crop=0&crop=1&crop=1&id=G8e3F&originHeight=162&originWidth=900&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
于是有:%0A#card=math&code=pk%5E%7Bt%2B1%7D%3D%5Cfrac%7B1%7D%7BN%7D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28z_i%3Dk%7Cx_i%2C%5Ctheta%5Et%29%0A#crop=0&crop=0&crop=1&crop=1&id=vHjNb&originHeight=66&originWidth=263&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
,这两个参数是无约束的,直接求导即可。
