Overview


Linear regression becomes linear classification once an activation function is added or the dimensionality is reduced.


Perceptron

Idea: error-driven learning.

Data: $\{(x_i, y_i)\}_{i=1}^{N}$; $D$: the set of misclassified samples.

Model:

$$
\begin{split}
f(x) &= \operatorname{sign}(w^\top x), \quad x \in \mathbb{R}^p,\ w \in \mathbb{R}^p \\
\operatorname{sign}(a) &= \begin{cases} +1, & a \geqslant 0 \\ -1, & a < 0 \end{cases}
\end{split}
$$

Strategy: loss function

$$
L(w) = \sum_{i=1}^{N} I\left\{ y_i w^\top x_i < 0 \right\}
$$

A sample $(x_i, y_i)$ is correctly classified when $w^\top x_i$ has the same sign as $y_i$, that is, when

$$
y_i w^\top x_i > 0
$$

However, the indicator loss above is not differentiable with respect to $w$, so a more suitable loss function is needed.

Because $y_i w^\top x_i$ varies continuously with $w$, it can serve as the loss. Restricting attention to the misclassified samples, the loss function becomes:

$$
L(w) = \sum_{x_i \in D} -y_i w^\top x_i
$$

The gradient of this loss is:

$$
\nabla_w L = \sum_{x_i \in D} -y_i x_i
$$

Algorithm: SGD

At each step, take one misclassified sample $(x_i, y_i)$ with $x_i \in D$ and update $w$ with learning rate $\lambda$:

$$
\begin{aligned}
w^{(t+1)} &\leftarrow w^{(t)} - \lambda \nabla_w L \\
&= w^{(t)} + \lambda y_i x_i
\end{aligned}
$$
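The update rule can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `train_perceptron`, the parameters `lr`, `max_steps` and `seed`, the toy data, and the constant column used as a bias term are all assumptions made for the example. It also treats $y_i w^\top x_i = 0$ as misclassified so that a zero initial $w$ can make progress.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, max_steps=1000, seed=0):
    """Perceptron trained with the update w <- w + lr * y_i * x_i.

    X: (N, p) array of samples; y: (N,) array of labels in {+1, -1}.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(max_steps):
        # D: indices of currently misclassified samples.
        # Assumption: a margin of exactly 0 also counts as misclassified.
        wrong = np.flatnonzero(y * (X @ w) <= 0)
        if wrong.size == 0:
            break  # every sample satisfies y_i * w^T x_i > 0
        i = rng.choice(wrong)       # pick one misclassified sample
        w = w + lr * y[i] * X[i]    # step against the gradient -y_i * x_i
    return w

# Toy linearly separable data; the last column is a constant 1 acting as a bias.
X = np.array([[2.0, 1.0, 1.0],
              [3.0, 2.0, 1.0],
              [-1.0, -2.0, 1.0],
              [-2.0, -1.0, 1.0]])
y = np.array([1, 1, -1, -1])

w = train_perceptron(X, y)
print(w, np.sign(X @ w))
```

On linearly separable data such as this toy set, the loop stops as soon as no misclassified sample remains; on non-separable data it simply runs until `max_steps`.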

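Linear Discriminant Analysis

Data: $\{(x_i, y_i)\}_{i=1}^{N}$, where $c_1$ denotes the first class (with $N_1$ samples) and $c_2$ the second class (with $N_2$ samples).

Idea: small within-class scatter (the variance inside each class is small enough) and large between-class separation. In other words, find the best one-dimensional projection of the data.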

Some notation:

  1. The projection of $x_i$ onto $w$ can be written as $z_i = w^\top x_i$, where $\|w\| = 1$.
  2. The mean of the projections of all samples onto $w$ is $\bar{z} = \frac{1}{N}\sum_{i=1}^{N} z_i = \frac{1}{N}\sum_{i=1}^{N} w^\top x_i$.
  3. The covariance of the projections is $S_z = \frac{1}{N}\sum_{i=1}^{N}(z_i-\bar{z})(z_i-\bar{z})^\top = \frac{1}{N}\sum_{i=1}^{N}\left(w^\top x_i-\bar{z}\right)\left(w^\top x_i-\bar{z}\right)^\top$.
  4. The per-class means and variances, $\bar{z}_1, S_1$ and $\bar{z}_2, S_2$, are defined in the same way over the samples of each class.

Between-class: $(\bar{z}_1-\bar{z}_2)^2$

Within-class: $S_1 + S_2$

Construct the objective function (to be maximized):

$$
J(w) = \frac{\left(\bar{z}_1-\bar{z}_2\right)^2}{S_1+S_2} = \frac{\left(w^\top\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\right)^2}{w^\top\left(S_{c_1}+S_{c_2}\right)w}
$$

where:

$$
\begin{split}
S_1 &= \frac{1}{N_1}\sum_{i=1}^{N_1}\left(w^\top x_i-\frac{1}{N_1}\sum_{j=1}^{N_1}w^\top x_j\right)\left(w^\top x_i-\frac{1}{N_1}\sum_{j=1}^{N_1}w^\top x_j\right)^\top \\
&= \frac{1}{N_1}\sum_{i=1}^{N_1} w^\top\left(x_i-\bar{x}_{c_1}\right)\left(x_i-\bar{x}_{c_1}\right)^\top w \\
&= w^\top\left[\frac{1}{N_1}\sum_{i=1}^{N_1}\left(x_i-\bar{x}_{c_1}\right)\left(x_i-\bar{x}_{c_1}\right)^\top\right]w \\
&= w^\top S_{c_1} w \\
S_1+S_2 &= w^\top S_{c_1} w + w^\top S_{c_2} w \\
&= w^\top\left(S_{c_1}+S_{c_2}\right)w
\end{split}
$$

Here the sums run over the $N_1$ samples of class $c_1$, and $S_{c_1}$, $S_{c_2}$ are the covariance matrices of the two classes in the original space.

Therefore:

$$
\begin{split}
J(w) &= \frac{w^\top\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top w}{w^\top\left(S_{c_1}+S_{c_2}\right)w} \\
&= \frac{w^\top S_b w}{w^\top S_w w} \\
&= w^\top S_b w\left(w^\top S_w w\right)^{-1}
\end{split}
$$

where $S_b = \left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top$ is the between-class variance and $S_w = S_{c_1}+S_{c_2}$ is the within-class variance.
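As a quick numerical sanity check, the two forms of $J(w)$ (computed from the projections, and computed from $S_b$ and $S_w$) can be compared on random data. The data, the helper `class_cov`, and all variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0, 0.0], size=(50, 3))   # class c_1, shape (N_1, p)
X2 = rng.normal(loc=[1.0, 2.0, 0.0], size=(40, 3))   # class c_2, shape (N_2, p)
w = rng.normal(size=3)
w /= np.linalg.norm(w)                               # unit-length direction

def class_cov(X):
    """Class covariance matrix with 1/N_k normalisation."""
    d = X - X.mean(axis=0)
    return d.T @ d / len(X)

z1, z2 = X1 @ w, X2 @ w                              # projections z_i = w^T x_i
m = X1.mean(axis=0) - X2.mean(axis=0)
S_b = np.outer(m, m)                                 # between-class matrix
S_w = class_cov(X1) + class_cov(X2)                  # within-class matrix

# J from the projected means and variances (np.var uses 1/N_k by default).
J_projected = (z1.mean() - z2.mean()) ** 2 / (z1.var() + z2.var())
# J from the matrix form w^T S_b w / w^T S_w w.
J_matrix = (w @ S_b @ w) / (w @ S_w @ w)
print(J_projected, J_matrix)
```

The two printed values should agree up to floating-point error for any direction `w`.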

Differentiate the objective and set the derivative to zero:

$$
\frac{\partial J(w)}{\partial w} = 2 S_b w\left(w^\top S_w w\right)^{-1} + w^\top S_b w\cdot(-1)\cdot\left(w^\top S_w w\right)^{-2}\cdot 2 S_w w = 0
$$

Multiplying both sides by $\left(w^\top S_w w\right)^2$ and dividing by 2 gives

$$
\begin{aligned}
S_b w\left(w^\top S_w w\right) - w^\top S_b w\cdot S_w w &= 0 \\
w^\top S_b w\cdot S_w w &= S_b w\left(w^\top S_w w\right)
\end{aligned}
$$

Here $w^\top S_b w$ and $w^\top S_w w$ are both scalars, so

$$
S_w w = \frac{w^\top S_w w}{w^\top S_b w}\, S_b w
$$

Only the direction of $w$ matters here, not its magnitude, so

$$
\begin{split}
w &= \frac{w^\top S_w w}{w^\top S_b w}\, S_w^{-1} S_b w \\
&\propto S_w^{-1} S_b w \\
&\propto S_w^{-1}\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top w
\end{split}
$$

The factor $\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)^\top w$ is a scalar and does not affect the direction, so

$$
w \propto S_w^{-1}\left(\bar{x}_{c_1}-\bar{x}_{c_2}\right)
$$
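The closed-form direction $w \propto S_w^{-1}(\bar{x}_{c_1}-\bar{x}_{c_2})$ can be sketched in NumPy as below. The function names `lda_direction` and `fisher_ratio`, the synthetic Gaussian data, and the comparison direction `naive` are assumptions made for illustration only.

```python
import numpy as np

def lda_direction(X1, X2):
    """Return the unit vector proportional to S_w^{-1} (mean_1 - mean_2)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Class covariance matrices S_{c_1}, S_{c_2} with 1/N_k normalisation.
    S_c1 = (X1 - m1).T @ (X1 - m1) / len(X1)
    S_c2 = (X2 - m2).T @ (X2 - m2) / len(X2)
    S_w = S_c1 + S_c2
    # Solve S_w w = (m1 - m2) instead of forming the inverse explicitly.
    w = np.linalg.solve(S_w, m1 - m2)
    return w / np.linalg.norm(w)

def fisher_ratio(w, X1, X2):
    """J(w) = (z1_bar - z2_bar)^2 / (S_1 + S_2) for projections z = w^T x."""
    z1, z2 = X1 @ w, X2 @ w
    return (z1.mean() - z2.mean()) ** 2 / (z1.var() + z2.var())

rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=200)
X2 = rng.multivariate_normal([2.0, 0.0], cov, size=200)

w = lda_direction(X1, X2)
naive = np.array([1.0, 0.0])  # direction joining the true means, for comparison
print(fisher_ratio(w, X1, X2), fisher_ratio(naive, X1, X2))
```

Because the features are correlated in this toy data, the LDA direction should reach a Fisher ratio at least as large as the plain mean-difference direction.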

Logistic Regression

Data: $\left\{(x_i, y_i)\right\}_{i=1}^{N},\ x_i\in\mathbb{R}^p,\ y_i\in\{0,1\}$

Sigmoid function:

$$
\begin{split}
\sigma(z) &= \frac{1}{1+e^{-z}} \\
\sigma &: \ \mathbb{R} \to (0,1) \\
&: \ w^\top x \longmapsto p
\end{split}
$$

The conditional distribution of $y$ given $x$ is:

$$
\begin{array}{l}
p_1 = p(y=1 \mid x) = \sigma\left(w^\top x\right) = \dfrac{1}{1+e^{-w^\top x}}, \quad y=1 \\[2ex]
p_0 = p(y=0 \mid x) = 1-p(y=1 \mid x) = \dfrac{e^{-w^\top x}}{1+e^{-w^\top x}}, \quad y=0
\end{array}
$$

The two cases can be combined into a single expression (this is purely a notational shorthand):

$$
p(y \mid x) = p_1^{y}\, p_0^{1-y}
$$

MLE:

$$
\begin{split}
\hat{w} &= \mathop{\arg\max}_{w}\ \log p(y \mid x) \\
&= \mathop{\arg\max}_{w}\ \log\prod_{i=1}^{N} P\left(y_i \mid x_i\right) \\
&= \mathop{\arg\max}_{w}\ \sum_{i=1}^{N}\log P\left(y_i \mid x_i\right) \\
&= \mathop{\arg\max}_{w}\ \sum_{i=1}^{N}\left(y_i\log p_1+\left(1-y_i\right)\log p_0\right) \\
&= \mathop{\arg\max}_{w}\ \sum_{i=1}^{N} y_i\log\psi\left(x_i; w\right)+\left(1-y_i\right)\log\left(1-\psi\left(x_i; w\right)\right)
\end{split}
$$

where $\psi(x_i; w) = \sigma(w^\top x_i)$ is $p_1$ evaluated at $x_i$.

Maximizing the likelihood (MLE) is equivalent to minimizing the cross-entropy loss.
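A minimal sketch of this maximum-likelihood fit is shown below. The notes do not specify an optimizer, so plain gradient ascent is an assumption here, as are the names `fit_logistic` and `log_likelihood`, the step size `lr`, the iteration count `steps`, and the synthetic data. The gradient used is $\sum_i (y_i - \sigma(w^\top x_i))\,x_i$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """sum_i y_i log p_1 + (1 - y_i) log p_0, with p_1 = sigmoid(w^T x_i)."""
    p1 = sigmoid(X @ w)
    eps = 1e-12  # guards against log(0)
    return np.sum(y * np.log(p1 + eps) + (1 - y) * np.log(1 - p1 + eps))

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Maximise the log-likelihood by gradient ascent (an assumed optimiser)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (y - sigmoid(X @ w))  # gradient of the log-likelihood
        w += lr * grad / len(y)            # averaged step for a stable scale
    return w

rng = np.random.default_rng(0)
# Two random features plus a constant column that acts as a bias term.
X = np.hstack([rng.normal(size=(200, 2)), np.ones((200, 1))])
true_w = np.array([2.0, -1.0, 0.5])        # hypothetical generating weights
y = (rng.random(200) < sigmoid(X @ true_w)).astype(float)

w_hat = fit_logistic(X, y)
print(log_likelihood(np.zeros(3), X, y), log_likelihood(w_hat, X, y))
```

The negative of `log_likelihood` is the cross-entropy loss, so the same loop can be read as gradient descent on that loss.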

Gaussian Discriminant Analysis

Rather than computing the probability $p(y\mid x)$ directly, GDA only asks which of $p(y=0\mid x)$ and $p(y=1\mid x)$ is larger. By Bayes' theorem, $p(y\mid x) \propto p(x\mid y)\cdot p(y)$, so the method models the joint distribution, where $p(y)$ is the prior, $p(x\mid y)$ is the likelihood, and $p(y\mid x)$ is the posterior.

The GDA prediction is therefore:

$$
\hat{y} = \mathop{\arg\max}_{y}\ p(y)\cdot p(x\mid y)
$$

Since $y \in \{0, 1\}$, let

$$
\begin{split}
y &\sim \text{Bernoulli}(\phi): \quad
\begin{cases} \phi^{y}, & y=1 \\ (1-\phi)^{1-y}, & y=0 \end{cases}
\ \Rightarrow\ p(y) = \phi^{y}\cdot(1-\phi)^{1-y} \\
&\left.\begin{array}{l}
x \mid y=1 \sim N(\mu_1, \Sigma) \\
x \mid y=0 \sim N(\mu_2, \Sigma)
\end{array}\right\}
\ \Rightarrow\ p(x \mid y) = N(\mu_1, \Sigma)^{y}\cdot N(\mu_2, \Sigma)^{1-y}
\end{split}
$$

Define the log-likelihood function:

$$
\begin{aligned}
L(\theta) &= \log\prod_{i=1}^{N} P\left(x_i, y_i\right) \\
&= \sum_{i=1}^{N}\log\left(P\left(x_i \mid y_i\right)P\left(y_i\right)\right) \\
&= \sum_{i=1}^{N}\left[\log P\left(x_i \mid y_i\right)+\log P\left(y_i\right)\right] \\
&= \sum_{i=1}^{N}\left[\log\left(N\left(\mu_1, \Sigma\right)^{y_i}\cdot N\left(\mu_2, \Sigma\right)^{1-y_i}\right)+\log\phi^{y_i}(1-\phi)^{1-y_i}\right] \\
&= \sum_{i=1}^{N}\left[\log N\left(\mu_1, \Sigma\right)^{y_i}+\log N\left(\mu_2, \Sigma\right)^{1-y_i}+\log\phi^{y_i}(1-\phi)^{1-y_i}\right]
\end{aligned}
$$

The parameters $\theta = (\mu_1, \mu_2, \Sigma, \phi)$ are then obtained from $\hat{\theta} = \mathop{\arg\max}_{\theta}\ L(\theta)$.

Solving the GDA model

$L(\theta)$ splits into three parts:

  1. $① = \sum_{i=1}^{N}\log N\left(\mu_1, \Sigma\right)^{y_i}$
  2. $② = \sum_{i=1}^{N}\log N\left(\mu_2, \Sigma\right)^{1-y_i}$
  3. $③ = \sum_{i=1}^{N}\log\phi^{y_i}(1-\phi)^{1-y_i}$
  • Solving for $\phi$ involves only ③: $③ = \sum_{i=1}^{N} y_i\log\phi + (1-y_i)\log(1-\phi)$

$$
\begin{split}
\frac{\partial ③}{\partial\phi} = \sum_{i=1}^{N} y_i\frac{1}{\phi} - (1-y_i)\frac{1}{1-\phi} &= 0 \\
\sum_{i=1}^{N} y_i(1-\phi) - (1-y_i)\phi &= 0 \\
\sum_{i=1}^{N} y_i - y_i\phi - \phi + y_i\phi &= 0 \\
\sum_{i=1}^{N}\left(y_i-\phi\right) &= 0 \\
\sum_{i=1}^{N} y_i &= N\phi \\
\hat{\phi} &= \frac{1}{N}\sum_{i=1}^{N} y_i = \frac{N_1}{N}
\end{split}
$$

Here $N_1$ is the number of samples with $y_i = 1$.

  • Solving for $\mu_1$ involves only ①: $① = \sum_{i=1}^{N} y_i\log N\left(\mu_1, \Sigma\right) = \sum_{i=1}^{N} y_i\log\left(\frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left\{-\frac{1}{2}(x_i-\mu_1)^\top\Sigma^{-1}(x_i-\mu_1)\right\}\right)$

$\mu_1$ depends only on the following expression:

$$
\begin{split}
\mu_1 &= \mathop{\arg\max}_{\mu_1}\ \sum_{i=1}^{N} y_i\left(-\frac{1}{2}(x_i-\mu_1)^\top\Sigma^{-1}(x_i-\mu_1)\right) \\
&= \mathop{\arg\max}_{\mu_1}\ -\frac{1}{2}\sum_{i=1}^{N} y_i\left(x_i^\top\Sigma^{-1}x_i - 2\mu_1^\top\Sigma^{-1}x_i + \mu_1^\top\Sigma^{-1}\mu_1\right)
\end{split}
$$

With respect to $\mu_1$, the first term $x_i^\top\Sigma^{-1}x_i$ is a constant, so only the last two terms matter. Let:

$$
\Delta = -\frac{1}{2}\sum_{i=1}^{N} y_i\left(-2\mu_1^\top\Sigma^{-1}x_i + \mu_1^\top\Sigma^{-1}\mu_1\right) = \sum_{i=1}^{N} y_i\left(\mu_1^\top\Sigma^{-1}x_i - \frac{1}{2}\mu_1^\top\Sigma^{-1}\mu_1\right)
$$

Then:

$$
\begin{split}
\frac{\partial\Delta}{\partial\mu_1} = \sum_{i=1}^{N} y_i\left(\Sigma^{-1}x_i - \Sigma^{-1}\mu_1\right) &= 0 \\
\sum_{i=1}^{N} y_i\left(x_i-\mu_1\right) &= 0 \\
\sum_{i=1}^{N} y_i\mu_1 &= \sum_{i=1}^{N} y_i x_i \\
\hat{\mu}_1 &= \frac{\sum_{i=1}^{N} y_i x_i}{\sum_{i=1}^{N} y_i} \\
\hat{\mu}_1 &= \frac{\sum_{i=1}^{N} y_i x_i}{N_1}
\end{split}
$$

By symmetry, the same steps applied to ② give $\hat{\mu}_2 = \frac{\sum_{i=1}^{N}(1-y_i)x_i}{\sum_{i=1}^{N}(1-y_i)}$.

  • Solving for $\Sigma$ involves only ① + ②.

First, split the samples into two groups:

$$
C_1 = \left\{x_i \mid y_i = 1\right\}, \quad C_2 = \left\{x_i \mid y_i = 0\right\}, \quad |C_1| = N_1,\ |C_2| = N_2,\ N_1 + N_2 = N
$$

Then ① + ② can be written as:

$$
①+② = \sum_{x_i\in C_1}\log N\left(\mu_1, \Sigma\right) + \sum_{x_i\in C_2}\log N\left(\mu_2, \Sigma\right)
$$

Simplify the sum for a generic Gaussian $N(\mu, \Sigma)$, writing $C$ for any term that does not depend on $\mu$ or $\Sigma$:

$$
\begin{split}
\sum_{i=1}^{N}\log N(\mu, \Sigma) &= \sum_{i=1}^{N}\log\left(\frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left\{-\frac{1}{2}(x_i-\mu)^\top\Sigma^{-1}(x_i-\mu)\right\}\right) \\
&= \sum_{i=1}^{N}\left[C - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(x_i-\mu)^\top\Sigma^{-1}(x_i-\mu)\right] \\
&= C - \frac{1}{2}N\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{N}(x_i-\mu)^\top\Sigma^{-1}(x_i-\mu)
\end{split}
$$

The last term is a scalar, so it can be viewed as the trace of a $1\times 1$ matrix. Using the cyclic property of the trace, $tr(ABC) = tr(CAB) = tr(BCA)$:

$$
\begin{split}
\sum_{i=1}^{N}(x_i-\mu)^\top\Sigma^{-1}(x_i-\mu)
&= \sum_{i=1}^{N} tr\left((x_i-\mu)^\top\Sigma^{-1}(x_i-\mu)\right) \\
&= \sum_{i=1}^{N} tr\left((x_i-\mu)(x_i-\mu)^\top\Sigma^{-1}\right) \\
&= tr\left(\sum_{i=1}^{N}(x_i-\mu)(x_i-\mu)^\top\Sigma^{-1}\right) \\
&= tr\left(NS\cdot\Sigma^{-1}\right) \\
&= N\,tr\left(S\Sigma^{-1}\right)
\end{split}
$$

where $S = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)(x_i-\mu)^\top$ is the sample covariance matrix. Hence:

$$
\sum_{i=1}^{N}\log N(\mu, \Sigma) = -\frac{1}{2}N\log|\Sigma| - \frac{1}{2}N\,tr\left(S\cdot\Sigma^{-1}\right) + C
$$
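The trace step can be checked numerically. The sketch below compares $\sum_i (x_i-\mu)^\top\Sigma^{-1}(x_i-\mu)$ with $N\,tr(S\Sigma^{-1})$ on random data; the data, the sizes `N` and `p`, and the particular positive definite `Sigma` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 100, 3
X = rng.normal(size=(N, p))
mu = X.mean(axis=0)

A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)      # an arbitrary symmetric positive definite matrix
Sigma_inv = np.linalg.inv(Sigma)

diff = X - mu
# Left-hand side: sum_i (x_i - mu)^T Sigma^{-1} (x_i - mu)
lhs = np.einsum("ij,jk,ik->", diff, Sigma_inv, diff)
# Right-hand side: N * tr(S Sigma^{-1}) with S = (1/N) sum_i (x_i - mu)(x_i - mu)^T
S = diff.T @ diff / N
rhs = N * np.trace(S @ Sigma_inv)

print(lhs, rhs)
```

Both values should agree up to floating-point error, whatever $\mu$ and $\Sigma$ are used.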

Applying this to each class, with $S_1$ and $S_2$ the sample covariance matrices of $C_1$ and $C_2$, the original objective becomes:

$$
\begin{split}
①+② &= -\frac{1}{2}N_1\log|\Sigma| - \frac{1}{2}N_1\,tr\left(S_1\cdot\Sigma^{-1}\right) - \frac{1}{2}N_2\log|\Sigma| - \frac{1}{2}N_2\,tr\left(S_2\cdot\Sigma^{-1}\right) + C \\
&= -\frac{1}{2}N\log|\Sigma| - \frac{1}{2}N_1\,tr\left(S_1\cdot\Sigma^{-1}\right) - \frac{1}{2}N_2\,tr\left(S_2\cdot\Sigma^{-1}\right) + C \\
&= -\frac{1}{2}\left(N\log|\Sigma| + N_1\,tr\left(S_1\cdot\Sigma^{-1}\right) + N_2\,tr\left(S_2\cdot\Sigma^{-1}\right)\right) + C
\end{split}
$$

Next, differentiate the objective with respect to $\Sigma$ and set the result to zero. For symmetric $S$ and $\Sigma$, $\frac{\partial\,tr(S\Sigma^{-1})}{\partial\Sigma} = -\Sigma^{-1}S\Sigma^{-1}$, so:

$$
\begin{split}
\frac{\partial(①+②)}{\partial\Sigma} &= -\frac{1}{2}\left(N\cdot\frac{1}{|\Sigma|}\cdot|\Sigma|\cdot\Sigma^{-1} - N_1\Sigma^{-1}S_1\Sigma^{-1} - N_2\Sigma^{-1}S_2\Sigma^{-1}\right) \\
&= -\frac{1}{2}\left(N\Sigma^{-1} - N_1\Sigma^{-1}S_1\Sigma^{-1} - N_2\Sigma^{-1}S_2\Sigma^{-1}\right) = 0 \\
&\Rightarrow N\Sigma^{-1} - N_1\Sigma^{-1}S_1\Sigma^{-1} - N_2\Sigma^{-1}S_2\Sigma^{-1} = 0 \\
&\Rightarrow N\Sigma - N_1S_1 - N_2S_2 = 0 \\
&\Rightarrow \hat{\Sigma} = \frac{N_1S_1+N_2S_2}{N}
\end{split}
$$

The second-to-last line follows from multiplying by $\Sigma$ on both the left and the right.

Two derivative identities are used here:

  1. $\frac{\partial\,tr(AB)}{\partial A} = B^\top$
  2. $\frac{\partial |A|}{\partial A} = |A|A^{-1}$ (for symmetric $A$)
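Putting the closed-form estimates together, a GDA fit and the $\arg\max_y p(y)\,p(x\mid y)$ decision rule might look like the sketch below. The function names `fit_gda` and `predict_gda` and the synthetic data are assumptions for illustration; $\hat{\mu}_2$ is taken to be the mean of the $y=0$ samples, by symmetry with $\hat{\mu}_1$.

```python
import numpy as np

def fit_gda(X, y):
    """Closed-form MLE for GDA with a shared covariance; y takes values in {0, 1}."""
    N = len(y)
    X1, X2 = X[y == 1], X[y == 0]          # C_1 (y = 1) and C_2 (y = 0)
    N1, N2 = len(X1), len(X2)
    phi = N1 / N                           # phi_hat = N_1 / N
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - mu1).T @ (X1 - mu1) / N1    # per-class sample covariances
    S2 = (X2 - mu2).T @ (X2 - mu2) / N2
    Sigma = (N1 * S1 + N2 * S2) / N        # Sigma_hat = (N_1 S_1 + N_2 S_2) / N
    return phi, mu1, mu2, Sigma

def predict_gda(X, phi, mu1, mu2, Sigma):
    """argmax_y p(y) p(x | y) in log space; terms shared by both classes are dropped."""
    Sigma_inv = np.linalg.inv(Sigma)

    def score(mu, prior):
        d = X - mu
        return np.log(prior) - 0.5 * np.einsum("ij,jk,ik->i", d, Sigma_inv, d)

    return (score(mu1, phi) > score(mu2, 1 - phi)).astype(int)

rng = np.random.default_rng(0)
cov = [[1.0, 0.3], [0.3, 1.0]]
X = np.vstack([rng.multivariate_normal([2.0, 2.0], cov, size=150),
               rng.multivariate_normal([-2.0, -2.0], cov, size=50)])
y = np.array([1] * 150 + [0] * 50)

phi, mu1, mu2, Sigma = fit_gda(X, y)
accuracy = np.mean(predict_gda(X, phi, mu1, mu2, Sigma) == y)
print(phi, accuracy)
```

Because both classes share $\Sigma$, the normalizing constant and $\log|\Sigma|$ cancel in the comparison, which is why `score` keeps only the log prior and the quadratic term.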

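Naive Bayes Classifier

Idea: the naive Bayes assumption (conditional independence of the features given the class). It is the simplest probabilistic graphical model (a directed graph).

Motivation: simplify the computation.

$$
x_i \perp x_j \mid y \quad (i \neq j), \qquad p(x\mid y) = \prod_{j=1}^{p} p(x_j\mid y)
$$

where $x_j$ denotes the $j$-th feature of $x\in\mathbb{R}^p$.

This reduces the problem to:

$$
\hat{y} = \mathop{\arg\max}_{y}\ p(y)\prod_{j=1}^{p} p(x_j\mid y)
$$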

Here, the distribution of $p(y)$ is:

| Type | Distribution |
| --- | --- |
| Binary classification | $y \sim \text{Bernoulli}$ |
| Multi-class classification | $y \sim \text{Categorical}$ |

The distribution of $p(x_j\mid y)$ is:

| Type | Distribution |
| --- | --- |
| $x_j$ discrete | $x_j \sim \text{Categorical}$ |
| $x_j$ continuous | $x_j \sim N(\mu_j, \sigma_j^2)$ |

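Remark:

The Bernoulli distribution generalizes to the binomial distribution, and the categorical distribution generalizes to the multinomial distribution.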
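For continuous features, the decision rule $\hat{y} = \arg\max_y p(y)\prod_j p(x_j\mid y)$ with a categorical prior and per-feature Gaussians can be sketched as follows. The function names, the small variance floor `1e-9`, and the three-class toy data are assumptions made for this example.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class, per-feature Gaussian parameters."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small floor keeps the variances strictly positive (an assumed safeguard).
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_gaussian_nb(X, classes, priors, means, variances):
    """argmax_y log p(y) + sum_j log p(x_j | y)."""
    # diff[n, k, j] = x_nj - mean_kj
    diff = X[:, None, :] - means[None, :, :]
    # log_lik[n, k, j] = log N(x_nj | mean_kj, var_kj)
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None])
    # Conditional independence: the per-feature log-likelihoods simply add up.
    scores = np.log(priors)[None, :] + log_lik.sum(axis=2)
    return classes[np.argmax(scores, axis=1)]

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(100, 2)) for c in centers])
y = np.repeat([0, 1, 2], 100)

params = fit_gaussian_nb(X, y)
accuracy = np.mean(predict_gaussian_nb(X, *params) == y)
print(accuracy)
```

Working in log space turns the product over features into a sum, which avoids numerical underflow when $p$ is large.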