Overview
Linear regression → linear classification (by adding an activation function, or by reducing dimensionality)
Perceptron
Idea: error-driven learning
Data: $\{(x_i, y_i)\}_{i=1}^{N}$; $D$: the set of misclassified samples
Model:
$$
\begin{split}
f(x) &= \operatorname{sign}(w^\top x), \quad x \in \mathbb{R}^p,\ w \in \mathbb{R}^p \\
\operatorname{sign}(a) &= \left\{\begin{array}{ll} +1, & a \geqslant 0 \\ -1, & a < 0 \end{array}\right.
\end{split}
$$
Strategy: loss function
$$
L(w) = \sum_{i=1}^{N} I\left\{ y_i w^\top x_i < 0 \right\}
$$
A sample is correctly classified when $y_i w^\top x_i > 0$, so the indicator above counts the misclassified samples.
However, this loss is not differentiable in $w$, so we look for a suitable replacement.
Since $y_i w^\top x_i$ is itself a continuous function of $w$, it can serve as the loss, restricted to the misclassified samples:
$$
L(w) = \sum_{x_i \in D} -y_i w^\top x_i
$$
Its gradient is then:
$$
\nabla_w L = \sum_{x_i \in D} -y_i x_i
$$
Algorithm: SGD, using one misclassified sample $(x_i, y_i) \in D$ per step with learning rate $\lambda$
$$
\begin{aligned}
w^{(t+1)} &\leftarrow w^{(t)} - \lambda \nabla_w L \\
&= w^{(t)} + \lambda y_i x_i
\end{aligned}
$$
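The update rule can be tried out with a minimal plain-Python sketch. The helper names, the toy data, and the constant-1 bias feature are all made up for illustration. One deviation from the text is deliberate and labeled in the code: a sample is treated as misclassified when $y_i w^\top x_i \le 0$ (not strictly $< 0$), since otherwise the all-zero starting weights would never be updated.

```python
def sign(a):
    """sign(a) as in the model: +1 for a >= 0, otherwise -1."""
    return 1 if a >= 0 else -1


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def train_perceptron(xs, ys, lr=1.0, max_epochs=100):
    """Error-driven training: w <- w + lr * y_i * x_i on each mistake."""
    w = [0.0] * len(xs[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            # Assumption: "<= 0" instead of "< 0", so that w = 0 gets updated.
            if y * dot(w, x) <= 0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return w


# Toy, linearly separable data; the last feature is a constant 1 (bias term).
xs = [[2.0, 1.0, 1.0], [1.0, 3.0, 1.0], [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]]
ys = [1, 1, -1, -1]

w = train_perceptron(xs, ys)
predictions = [sign(dot(w, x)) for x in xs]
print(w, predictions)
```

On this data a single update is enough; on data that is not linearly separable the loop simply stops after `max_epochs` passes.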
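Linear discriminant analysis
Data: $\{(x_i, y_i)\}_{i=1}^{N}$, where $c_1$ denotes the first class (with $N_1$ samples) and $c_2$ the second class (with $N_2$ samples).
Idea: small within-class spread (small enough variance) and large between-class separation; that is, find the best one-dimensional projection in the feature space.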
Some notation:
- The projection of $x_i$ onto $w$ can be written as $z_i = w^\top x_i$, where $\|w\| = 1$
- The mean of all projected samples is $\bar{z} = \frac{1}{N}\sum_{i=1}^{N} z_i = \frac{1}{N}\sum_{i=1}^{N} w^\top x_i$
- The covariance of the projections is $S_z = \frac{1}{N}\sum_{i=1}^{N}(z_i-\bar{z})(z_i-\bar{z})^\top = \frac{1}{N}\sum_{i=1}^{N}\left(w^\top x_i-\bar{z}\right)\left(w^\top x_i-\bar{z}\right)^\top$
- The per-class quantities $\bar{z}_1, S_1$ and $\bar{z}_2, S_2$ are defined in the same way
Between-class separation is measured by $(\bar{z}_1-\bar{z}_2)^2$
Within-class spread is measured by $S_1 + S_2$
Construct the objective function (to be maximized):
$$
J(w) = \frac{\left(\bar{z}_1-\bar{z}_2\right)^2}{S_1+S_2} = \frac{\left(w^\top\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\right)^2}{w^\top\left(S_{c_1}+S_{c_2}\right)w}
$$
where, summing over the $N_1$ samples of class $c_1$:
$$
\begin{split}
S_1 &= \frac{1}{N_1}\sum_{i=1}^{N_1}\left(w^\top x_i - \frac{1}{N_1}\sum_{j=1}^{N_1} w^\top x_j\right)\left(w^\top x_i - \frac{1}{N_1}\sum_{j=1}^{N_1} w^\top x_j\right)^\top \\
&= \frac{1}{N_1}\sum_{i=1}^{N_1} w^\top\left(x_i-\bar{x}_{c_1}\right)\left(x_i-\bar{x}_{c_1}\right)^\top w \\
&= w^\top\left[\frac{1}{N_1}\sum_{i=1}^{N_1}\left(x_i-\bar{x}_{c_1}\right)\left(x_i-\bar{x}_{c_1}\right)^\top\right] w \\
&= w^\top S_{c_1} w \\
S_1+S_2 &= w^\top S_{c_1} w + w^\top S_{c_2} w \\
&= w^\top\left(S_{c_1}+S_{c_2}\right) w
\end{split}
$$
Therefore:
$$
\begin{split}
J(w) &= \frac{w^\top\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top w}{w^\top\left(S_{c_1}+S_{c_2}\right) w} \\
&= \frac{w^\top S_b w}{w^\top S_w w} \\
&= w^\top S_b w \left(w^\top S_w w\right)^{-1}
\end{split}
$$
where $S_b = \left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top$ is the between-class scatter and $S_w = S_{c_1}+S_{c_2}$ is the within-class scatter.
Differentiate the objective and set the derivative to zero:
$$
\frac{\partial J(w)}{\partial w} = 2 S_b w\left(w^\top S_w w\right)^{-1} + w^\top S_b w \cdot (-1) \cdot \left(w^\top S_w w\right)^{-2} \cdot 2 S_w w = 0
$$
Multiplying both sides by $\left(w^\top S_w w\right)^2$ and dividing by 2 gives
$$
\begin{split}
S_b w\left(w^\top S_w w\right) - w^\top S_b w \cdot S_w w &= 0 \\
w^\top S_b w \cdot S_w w &= S_b w\left(w^\top S_w w\right)
\end{split}
$$
Here $w^\top S_b w$ and $w^\top S_w w$ are both scalars, so
$$
S_w w = \frac{w^\top S_w w}{w^\top S_b w} S_b w
$$
Only the direction of $w$ matters, not its magnitude, so
$$
\begin{split}
w &= \frac{w^\top S_w w}{w^\top S_b w} S_w^{-1} S_b w \\
&\propto S_w^{-1} S_b w \\
&\propto S_w^{-1}\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top w
\end{split}
$$
The factor $\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top w$ is a scalar and does not affect the direction, so
$$
w \propto S_w^{-1}\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)
$$
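As a rough illustration of the closed form $w \propto S_w^{-1}(\bar{x}_{c_1}-\bar{x}_{c_2})$, the plain-Python sketch below computes the direction for two small two-dimensional classes. The helper names and the toy data are assumptions made for this example, and the hand-written 2x2 inverse restricts it to exactly two features.

```python
def mean(points):
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]


def covariance(points):
    """Biased (1/N) covariance matrix, playing the role of S_{c_k}."""
    m = mean(points)
    n, d = len(points), len(m)
    return [[sum((p[a] - m[a]) * (p[b] - m[b]) for p in points) / n
             for b in range(d)] for a in range(d)]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix; assumes a non-zero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def lda_direction(class1, class2):
    """Return S_w^{-1} (mean1 - mean2), i.e. w up to a scalar factor."""
    m1, m2 = mean(class1), mean(class2)
    s1, s2 = covariance(class1), covariance(class2)
    s_w = [[s1[a][b] + s2[a][b] for b in range(2)] for a in range(2)]
    s_w_inv = inverse_2x2(s_w)
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return [s_w_inv[a][0] * diff[0] + s_w_inv[a][1] * diff[1] for a in range(2)]


# Toy data: two square clusters centred at (1, 1) and (5, 5).
class1 = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
class2 = [[4.0, 4.0], [6.0, 4.0], [4.0, 6.0], [6.0, 6.0]]

w = lda_direction(class1, class2)
z1 = [w[0] * p[0] + w[1] * p[1] for p in class1]  # projections z_i = w^T x_i
z2 = [w[0] * p[0] + w[1] * p[1] for p in class2]
print(w, z1, z2)
```

The projections of the two classes do not overlap on this data, so a single threshold on $z$ separates them.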
Logistic regression
Data: $\left\{(x_i, y_i)\right\}_{i=1}^{N},\ x_i \in \mathbb{R}^p,\ y_i \in \{0, 1\}$
Sigmoid function:
$$
\begin{split}
\sigma(z) &= \frac{1}{1+e^{-z}} \\
\sigma &: \ \mathbb{R} \to (0, 1) \\
&: \ w^\top x \mapsto p
\end{split}
$$
The conditional distribution of $y$ given $x$ is then:
$$
\begin{array}{l}
p_1 = p(y=1 \mid x) = \sigma\left(w^\top x\right) = \dfrac{1}{1+e^{-w^\top x}}, \quad y = 1 \\
p_0 = p(y=0 \mid x) = 1 - p(y=1 \mid x) = \dfrac{e^{-w^\top x}}{1+e^{-w^\top x}}, \quad y = 0
\end{array}
$$
The two cases can be written compactly (purely a notational convenience):
$$
p(y \mid x) = p_1^{y}\, p_0^{1-y}
$$
MLE:
$$
\begin{split}
\hat{w} &= \mathop{\arg\max}_{w} \ \log p(y \mid x) \\
&= \mathop{\arg\max}_{w} \ \log \prod_{i=1}^{N} P\left(y_i \mid x_i\right) \\
&= \mathop{\arg\max}_{w} \ \sum_{i=1}^{N} \log P\left(y_i \mid x_i\right) \\
&= \mathop{\arg\max}_{w} \ \sum_{i=1}^{N} \left( y_i \log p_1 + \left(1-y_i\right) \log p_0 \right) \\
&= \mathop{\arg\max}_{w} \ \sum_{i=1}^{N} \left[ y_i \log \psi\left(x_i; w\right) + \left(1-y_i\right) \log\left(1-\psi\left(x_i; w\right)\right) \right]
\end{split}
$$
where $\psi(x_i; w) = \sigma(w^\top x_i)$ is the predicted probability $p_1$ for sample $i$.
MLE (max) ⟺ loss function (min cross entropy): maximizing this log-likelihood is the same as minimizing the cross-entropy loss.
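The notes stop at the objective, so the sketch below is only one possible way to maximize it: plain gradient ascent on the log-likelihood. It relies on the standard gradient $\sum_i (y_i - \sigma(w^\top x_i))\, x_i$, which is not derived above. The step size, the number of steps, the helper names and the toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """p_1 = p(y = 1 | x) = sigmoid(w^T x)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def log_likelihood(w, xs, ys):
    """sum_i y_i log p_1 + (1 - y_i) log p_0; the negative is the cross entropy."""
    total = 0.0
    for x, y in zip(xs, ys):
        p1 = predict_proba(w, x)
        total += y * math.log(p1) + (1 - y) * math.log(1.0 - p1)
    return total


def fit_logistic(xs, ys, lr=0.1, steps=200):
    """Gradient ascent on the log-likelihood (an assumed, simple optimizer)."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = y - predict_proba(w, x)
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wj + lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy data that is not linearly separable; the second feature is a constant bias.
xs = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
ys = [0, 1, 0, 1]

w_hat = fit_logistic(xs, ys)
print(w_hat, log_likelihood(w_hat, xs, ys))
```

The data are deliberately not separable: on separable data the maximizer does not exist (the weights grow without bound), and `math.log` could eventually receive 0.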
Gaussian Discriminant Analysis (GDA)
Rather than computing the probability $p(y \mid x)$ directly, we only ask which of $p(y=0 \mid x)$ and $p(y=1 \mid x)$ is larger. By Bayes' theorem, $p(y \mid x) \propto p(x \mid y) \cdot p(y)$, so we model the joint distribution, where $p(y)$ is the prior, $p(x \mid y)$ is the likelihood, and $p(y \mid x)$ is the posterior.
GDA is then:
$$
\hat{y} = \mathop{\arg\max}_{y} \ p(y) \cdot p(x \mid y)
$$
Since $y \in \{0, 1\}$, let
$$
\begin{split}
y &\sim \text{Bernoulli}(\phi): \quad \left\{ \begin{array}{ll} \phi^{y}, & y = 1 \\ (1-\phi)^{1-y}, & y = 0 \end{array} \right. \ \Rightarrow \ p(y) = \phi^{y} \cdot (1-\phi)^{1-y} \\
& \left. \begin{array}{l} x \mid y=1 \sim N(\mu_1, \Sigma) \\ x \mid y=0 \sim N(\mu_2, \Sigma) \end{array} \right\} \ \Rightarrow \ p(x \mid y) = N(\mu_1, \Sigma)^{y} \cdot N(\mu_2, \Sigma)^{1-y}
\end{split}
$$
Define the log-likelihood:
$$
\begin{aligned}
L(\theta) &= \log \prod_{i=1}^{N} P\left(x_i, y_i\right) \\
&= \sum_{i=1}^{N} \log \left( P\left(x_i \mid y_i\right) P\left(y_i\right) \right) \\
&= \sum_{i=1}^{N} \left[ \log P\left(x_i \mid y_i\right) + \log P\left(y_i\right) \right] \\
&= \sum_{i=1}^{N} \left[ \log \left( N\left(\mu_1, \Sigma\right)^{y_i} \cdot N\left(\mu_2, \Sigma\right)^{1-y_i} \right) + \log \phi^{y_i}(1-\phi)^{1-y_i} \right] \\
&= \sum_{i=1}^{N} \left[ \log N\left(\mu_1, \Sigma\right)^{y_i} + \log N\left(\mu_2, \Sigma\right)^{1-y_i} + \log \phi^{y_i}(1-\phi)^{1-y_i} \right]
\end{aligned}
$$
Then solve for the parameters $\theta = (\mu_1, \mu_2, \Sigma, \phi)$: $\hat{\theta} = \mathop{\arg\max}_{\theta} \ L(\theta)$
Solving the GDA model
Split $L(\theta)$ into three parts:
- $① = \sum_{i=1}^{N} \log N\left(\mu_1, \Sigma\right)^{y_i}$
- $② = \sum_{i=1}^{N} \log N\left(\mu_2, \Sigma\right)^{1-y_i}$
- $③ = \sum_{i=1}^{N} \log \phi^{y_i}(1-\phi)^{1-y_i}$
- Solving for $\phi$: only $③ = \sum_{i=1}^{N} \left[ y_i \log \phi + (1-y_i)\log(1-\phi) \right]$ is involved
$$
\begin{split}
\frac{\partial ③}{\partial \phi} = \sum_{i=1}^{N} \left[ y_i \frac{1}{\phi} - (1-y_i)\frac{1}{1-\phi} \right] &= 0 \\
\sum_{i=1}^{N} \left[ y_i(1-\phi) - (1-y_i)\phi \right] &= 0 \\
\sum_{i=1}^{N} \left( y_i - y_i\phi - \phi + y_i\phi \right) &= 0 \\
\sum_{i=1}^{N} \left( y_i - \phi \right) &= 0 \\
\sum_{i=1}^{N} y_i &= N\phi \\
\hat{\phi} &= \frac{1}{N}\sum_{i=1}^{N} y_i = \frac{N_1}{N}
\end{split}
$$
- Solving for $\mu_1$: only $① = \sum_{i=1}^{N} y_i \log N\left(\mu_1, \Sigma\right) = \sum_{i=1}^{N} y_i \log\left( \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}} \exp\left\{ -\frac{1}{2}(x_i-\mu_1)^\top \Sigma^{-1}(x_i-\mu_1) \right\} \right)$ is involved
$\mu_1$ then depends only on the following expression:
$$
\begin{split}
\mu_1 &= \mathop{\arg\max}_{\mu_1} \ \sum_{i=1}^{N} y_i \left( -\frac{1}{2}(x_i-\mu_1)^\top \Sigma^{-1}(x_i-\mu_1) \right) \\
&= \mathop{\arg\max}_{\mu_1} \ -\frac{1}{2}\sum_{i=1}^{N} y_i \left( x_i^\top \Sigma^{-1} x_i - 2\mu_1^\top \Sigma^{-1} x_i + \mu_1^\top \Sigma^{-1}\mu_1 \right)
\end{split}
$$
With respect to $\mu_1$, the first term $x_i^\top \Sigma^{-1} x_i$ is a constant, so only the last two terms matter. Define $\Delta$:
$$
\Delta = -\frac{1}{2}\sum_{i=1}^{N} y_i \left( -2\mu_1^\top \Sigma^{-1} x_i + \mu_1^\top \Sigma^{-1}\mu_1 \right) = \sum_{i=1}^{N} y_i \left( \mu_1^\top \Sigma^{-1} x_i - \frac{1}{2}\mu_1^\top \Sigma^{-1}\mu_1 \right)
$$
Then:
$$
\begin{split}
\frac{\partial \Delta}{\partial \mu_1} = \sum_{i=1}^{N} y_i \left( \Sigma^{-1} x_i - \Sigma^{-1}\mu_1 \right) &= 0 \\
\sum_{i=1}^{N} y_i (x_i - \mu_1) &= 0 \\
\sum_{i=1}^{N} y_i \mu_1 &= \sum_{i=1}^{N} y_i x_i \\
\mu_1 &= \frac{\sum_{i=1}^{N} y_i x_i}{\sum_{i=1}^{N} y_i} \\
\mu_1 &= \frac{\sum_{i=1}^{N} y_i x_i}{N_1}
\end{split}
$$
- Solving for $\Sigma$: only $① + ②$ is involved
First group the samples by class: $C_1 = \{x_i \mid y_i = 1\}$ and $C_2 = \{x_i \mid y_i = 0\}$, with $|C_1| = N_1$, $|C_2| = N_2$ and $N_1 + N_2 = N$.
Then $① + ②$ can be written as:
$$
① + ② = \sum_{x_i \in C_1} \log N\left(\mu_1, \Sigma\right) + \sum_{x_i \in C_2} \log N\left(\mu_2, \Sigma\right)
$$
Simplify a generic sum of the form $\sum_{i=1}^{N} \log N(\mu, \Sigma)$, where $C$ stands for any constant that does not depend on $\Sigma$:
$$
\begin{split}
\sum_{i=1}^{N} \log N(\mu, \Sigma) &= \sum_{i=1}^{N} \log\left( \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}} \exp\left\{ -\frac{1}{2}(x_i-\mu)^\top \Sigma^{-1}(x_i-\mu) \right\} \right) \\
&= \sum_{i=1}^{N} \left[ C - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(x_i-\mu)^\top \Sigma^{-1}(x_i-\mu) \right] \\
&= C - \frac{1}{2}N\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{N} (x_i-\mu)^\top \Sigma^{-1}(x_i-\mu)
\end{split}
$$
The rightmost term is a scalar, so it can be viewed as the trace of a $1 \times 1$ matrix. Using the cyclic property of the trace, $tr(ABC) = tr(CAB) = tr(BCA)$:
$$
\begin{split}
\sum_{i=1}^{N} (x_i-\mu)^\top \Sigma^{-1}(x_i-\mu)
&= \sum_{i=1}^{N} tr\left( (x_i-\mu)^\top \Sigma^{-1}(x_i-\mu) \right) \\
&= \sum_{i=1}^{N} tr\left( (x_i-\mu)(x_i-\mu)^\top \Sigma^{-1} \right) \\
&= tr\left( \sum_{i=1}^{N} (x_i-\mu)(x_i-\mu)^\top \Sigma^{-1} \right) \\
&= tr\left( N S \cdot \Sigma^{-1} \right) \\
&= N\, tr\left( S \Sigma^{-1} \right)
\end{split}
$$
where $S = \frac{1}{N}\sum_{i=1}^{N} (x_i-\mu)(x_i-\mu)^\top$ is the sample covariance matrix. Hence:
$$
\sum_{i=1}^{N} \log N(\mu, \Sigma) = -\frac{1}{2}N\log|\Sigma| - \frac{1}{2}N\, tr\left( S \cdot \Sigma^{-1} \right) + C
$$
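The trace step, $\sum_i (x_i-\mu)^\top \Sigma^{-1}(x_i-\mu) = N\, tr(S\Sigma^{-1})$, can be checked numerically. The sketch below evaluates both sides for three made-up points; the function names, the points, $\mu$ and $\Sigma^{-1}$ are illustrative assumptions only.

```python
def sum_of_quadratic_forms(points, mu, sigma_inv):
    """Left side: sum_i (x_i - mu)^T Sigma^{-1} (x_i - mu)."""
    d = len(mu)
    total = 0.0
    for p in points:
        diff = [p[a] - mu[a] for a in range(d)]
        total += sum(diff[a] * sigma_inv[a][b] * diff[b]
                     for a in range(d) for b in range(d))
    return total


def n_times_trace(points, mu, sigma_inv):
    """Right side: N * tr(S Sigma^{-1}), S = (1/N) sum_i (x_i - mu)(x_i - mu)^T."""
    n, d = len(points), len(mu)
    s = [[sum((p[a] - mu[a]) * (p[b] - mu[b]) for p in points) / n
          for b in range(d)] for a in range(d)]
    # tr(S Sigma^{-1}) = sum_a sum_b S[a][b] * Sigma^{-1}[b][a]
    return n * sum(s[a][b] * sigma_inv[b][a]
                   for a in range(d) for b in range(d))


points = [[1.0, 2.0], [3.0, 1.0], [2.0, 4.0]]
mu = [2.0, 2.0]
sigma_inv = [[2.0, -1.0], [-1.0, 1.0]]  # inverse of Sigma = [[1, 1], [1, 2]]

lhs = sum_of_quadratic_forms(points, mu, sigma_inv)
rhs = n_times_trace(points, mu, sigma_inv)
print(lhs, rhs)
```

Both sides come out as 11 (up to floating-point rounding). Note that the identity holds for any $\mu$, since $S$ is defined relative to the same $\mu$.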
With $S_1$ and $S_2$ denoting the sample covariance matrices of $C_1$ and $C_2$ (taken about $\mu_1$ and $\mu_2$), the original objective becomes:
$$
\begin{split}
① + ② &= -\frac{1}{2}N_1\log|\Sigma| - \frac{1}{2}N_1\, tr\left(S_1 \cdot \Sigma^{-1}\right) - \frac{1}{2}N_2\log|\Sigma| - \frac{1}{2}N_2\, tr\left(S_2 \cdot \Sigma^{-1}\right) + C \\
&= -\frac{1}{2}N\log|\Sigma| - \frac{1}{2}N_1\, tr\left(S_1 \cdot \Sigma^{-1}\right) - \frac{1}{2}N_2\, tr\left(S_2 \cdot \Sigma^{-1}\right) + C \\
&= -\frac{1}{2}\left( N\log|\Sigma| + N_1\, tr\left(S_1 \cdot \Sigma^{-1}\right) + N_2\, tr\left(S_2 \cdot \Sigma^{-1}\right) \right) + C
\end{split}
$$
Next, differentiate the objective with respect to $\Sigma$:
$$
\begin{split}
\frac{\partial (① + ②)}{\partial \Sigma} &= -\frac{1}{2}\left( N \cdot \frac{1}{|\Sigma|} \cdot |\Sigma| \cdot \Sigma^{-1} - N_1 \cdot S_1 \cdot \Sigma^{-2} - N_2 \cdot S_2 \cdot \Sigma^{-2} \right) \\
&= -\frac{1}{2}\left( N\Sigma^{-1} - N_1 S_1 \Sigma^{-2} - N_2 S_2 \Sigma^{-2} \right) = 0 \\
&\Rightarrow N\Sigma^{-1} - N_1 S_1 \Sigma^{-2} - N_2 S_2 \Sigma^{-2} = 0 \\
&\Rightarrow N\Sigma - N_1 S_1 - N_2 S_2 = 0 \\
&\Rightarrow \hat{\Sigma} = \frac{N_1 S_1 + N_2 S_2}{N}
\end{split}
$$
Here $S_k \Sigma^{-2}$ is shorthand: for symmetric $\Sigma$ the exact derivative of $tr(S_k \Sigma^{-1})$ is $-\Sigma^{-1} S_k \Sigma^{-1}$, which leads to the same stationary point.
Two matrix-derivative formulas were used:
- $\frac{\partial\, tr(AB)}{\partial A} = B^\top$
- $\frac{\partial |A|}{\partial A} = |A| A^{-1}$ (for symmetric $A$)
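The closed-form estimates can be collected into one short plain-Python sketch. The function name and the toy data are made up for illustration. The estimate for $\mu_2$ is not derived above; it is assumed here to follow from the $\mu_1$ derivation by symmetry (replacing $y_i$ with $1-y_i$).

```python
def fit_gda(xs, ys):
    """Closed-form MLE for GDA with a shared covariance matrix.

    Returns (phi, mu1, mu2, sigma) where
      phi   = N_1 / N
      mu1   = sum_i y_i x_i / N_1
      mu2   = sum_i (1 - y_i) x_i / N_2   (assumed by symmetry with mu1)
      sigma = (N_1 S_1 + N_2 S_2) / N
    """
    n, d = len(xs), len(xs[0])
    n1 = sum(ys)
    n2 = n - n1
    phi = n1 / n
    mu1 = [sum(y * x[j] for x, y in zip(xs, ys)) / n1 for j in range(d)]
    mu2 = [sum((1 - y) * x[j] for x, y in zip(xs, ys)) / n2 for j in range(d)]

    # N_1 S_1 + N_2 S_2 is the sum of outer products of the samples,
    # each centred on the mean of its own class.
    sigma = [[0.0] * d for _ in range(d)]
    for x, y in zip(xs, ys):
        mu = mu1 if y == 1 else mu2
        for a in range(d):
            for b in range(d):
                sigma[a][b] += (x[a] - mu[a]) * (x[b] - mu[b]) / n
    return phi, mu1, mu2, sigma


# Toy data: four samples with y = 1 and two samples with y = 0.
xs = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
ys = [1, 1, 1, 1, 0, 0]

phi, mu1, mu2, sigma = fit_gda(xs, ys)
print(phi, mu1, mu2, sigma)
```

On this data the estimates are $\hat{\phi} = 2/3$, $\mu_1 = (1, 1)$, $\mu_2 = (5, 5)$ and $\hat{\Sigma} \approx \begin{pmatrix} 1 & 1/3 \\ 1/3 & 1 \end{pmatrix}$.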
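Naive Bayes classifier
Idea: the naive Bayes assumption (conditional independence of the features given the class); this is the simplest probabilistic graphical model (a directed graph).
Motivation: simplify the computation.
In effect the problem reduces to the following, where $x_j$ is the $j$-th of the $p$ features of $x$:
$$
\hat{y} = \mathop{\arg\max}_{y} \ p(y)\prod_{j=1}^{p} p(x_j \mid y)
$$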
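Here the distribution of $p(y)$ is:
Type | Distribution |
---|---|
Binary classification | $y \sim \text{Bernoulli}$ |
Multiclass classification | $y \sim \text{Categorical}$ |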
The distribution of $p(x_j \mid y)$ is:
Type | Distribution |
---|---|
Discrete | $x_j \sim \text{Categorical}$ |
Continuous | $x_j \sim N(\mu_j, \sigma_j^2)$ |
Remark:
Bernoulli → Binomial and Categorical → Multinomial: in each pair, the first distribution is the single-trial case of the second.
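For continuous features, the decision rule $\hat{y} = \arg\max_y\, p(y)\prod_j p(x_j \mid y)$ can be sketched in plain Python as a Gaussian naive Bayes classifier. This is a minimal illustration, not a reference implementation: the function names and the toy data are made up, the class prior is estimated by class frequency, and each feature gets its own mean and (biased) variance per class. It works in log space to avoid multiplying many small probabilities.

```python
import math


def fit_gaussian_nb(xs, ys):
    """Per class: prior p(y), plus per-feature mean and variance."""
    model = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        prior = len(rows) / len(xs)
        means = [sum(col) / len(rows) for col in zip(*rows)]
        # Biased (1/N) variance; assumed non-zero for every feature.
        variances = [sum((v - m) ** 2 for v in col) / len(rows)
                     for col, m in zip(zip(*rows), means)]
        model[label] = (prior, means, variances)
    return model


def log_gaussian(v, mean, var):
    """log N(v; mean, var) for a single feature."""
    return -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)


def predict(model, x):
    """argmax_y [ log p(y) + sum_j log p(x_j | y) ]."""
    def score(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            log_gaussian(v, m, s2) for v, m, s2 in zip(x, means, variances))
    return max(model, key=score)


# Toy data: two well-separated clusters labelled 0 and 1.
xs = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [6.0, 7.0], [7.0, 6.0], [6.5, 6.5]]
ys = [0, 0, 0, 1, 1, 1]

model = fit_gaussian_nb(xs, ys)
print(predict(model, [1.2, 1.8]), predict(model, [6.8, 6.1]))
```

For discrete features, the per-feature Gaussian would be replaced by a categorical distribution estimated from counts.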