Estimation Error and Approximation Error

In general, for any $f \in \mathcal{F}$:

$$
R(f) - R^{*} = \left[ R(f) - R(f^{*}) \right] + \left[ R(f^{*}) - R^{*} \right]
$$

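Here $f^{*}$ is the model with the smallest error in $\mathcal{F}$, also called the best-in-class hypothesis, and $R^{*}$ is the Bayes error.

We call $R(f) - R(f^{*})$ the estimation error,

and $R(f^{*}) - R^{*}$ the approximation error.

The larger the hypothesis class $\mathcal{F}$, the smaller the approximation error, but the larger the estimation error tends to be.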
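To make the approximation-error half of this trade-off concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration: the distribution ($x$ uniform on $\{0,1,2,3\}$, $y = x^2 \pm 1$), the squared loss, and the two nested hypothesis classes on an integer grid. It computes $R(f^{*}) - R^{*}$ exactly for each class.

```python
from fractions import Fraction

# Hypothetical toy problem: x is uniform on {0, 1, 2, 3} and
# y = x**2 + noise, with noise equal to -1 or +1 with probability 1/2 each.
XS = [0, 1, 2, 3]
NOISE = [-1, 1]

def risk(f):
    """Exact squared-loss risk R(f) = E[(y - f(x))^2], by enumeration."""
    total = Fraction(0)
    for x in XS:
        for n in NOISE:
            y = x * x + n
            total += Fraction(1, len(XS) * len(NOISE)) * (y - f(x)) ** 2
    return total

# Bayes error R*: attained by the conditional mean E[y | x] = x**2.
bayes_risk = risk(lambda x: x * x)

# Two nested classes on an integer grid: constants (slope 0) are a
# subset of the linear functions a*x + b.
constants = [(0, b) for b in range(-2, 10)]
linears = [(a, b) for a in range(0, 4) for b in range(-2, 10)]

def approximation_error(params):
    """R(f*) - R*, where f* is the best-in-class hypothesis."""
    best = min(risk(lambda x, a=a, b=b: a * x + b) for a, b in params)
    return best - bayes_risk

approx_const = approximation_error(constants)
approx_linear = approximation_error(linears)
print(approx_const, approx_linear)
```

The larger class gives the smaller approximation error here. The sketch says nothing about the estimation error, which depends on a finite training sample.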


Let $R(f)$ denote the generalization error of $f$ and $\widehat{R}(f)$ its empirical error,

let $f_o = \arg\min_{f \in \mathcal{F}} \widehat{R}(f)$ be the model that minimizes the empirical error,

and let $f^{*} = \arg\min_{f \in \mathcal{F}} R(f)$ be the model that minimizes the generalization error.

Then, since $\widehat{R}(f_o) \le \widehat{R}(f^{*})$ by the definition of $f_o$:

$$
\begin{aligned}
R(f_o) - R(f^{*}) &= R(f_o) - \widehat{R}(f_o) + \widehat{R}(f_o) - R(f^{*}) \\
&\le R(f_o) - \widehat{R}(f_o) + \widehat{R}(f^{*}) - R(f^{*}) \\
&\le 2 \sup_{f \in \mathcal{F}} \left| R(f) - \widehat{R}(f) \right|
\end{aligned}
$$

Here $\sup_{f \in \mathcal{F}} |R(f) - \widehat{R}(f)|$ is the upper bound on the gap between the two kinds of error over the whole class.

As $\mathcal{F}$ grows, this supremum can only grow as well, which means the upper bound on the estimation error grows.

$R(f^{*})$ decreases as $\mathcal{F}$ grows, so the approximation error shrinks.
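The bound on the estimation error can be checked numerically. The sketch below is only an illustration under assumed choices (a finite class of grid-valued linear functions, squared loss, a toy distribution, and a sample of 20 points drawn with a fixed seed). It computes both sides of the inequality exactly with rational arithmetic.

```python
import random
from fractions import Fraction

XS = [0, 1, 2, 3]
NOISE = [-1, 1]

# Finite hypothesis class F: f(x) = a*x + b on a small integer grid (an assumption).
F = [(a, b) for a in range(0, 4) for b in range(-2, 10)]

def predict(h, x):
    a, b = h
    return a * x + b

def true_risk(h):
    """Exact R(f) for x uniform on XS and y = x**2 + noise."""
    pairs = [(x, x * x + n) for x in XS for n in NOISE]
    return Fraction(sum((y - predict(h, x)) ** 2 for x, y in pairs), len(pairs))

def empirical_risk(h, sample):
    """Average squared loss of h on the sample."""
    return Fraction(sum((y - predict(h, x)) ** 2 for x, y in sample), len(sample))

rng = random.Random(0)
sample = []
for _ in range(20):
    x = rng.choice(XS)
    sample.append((x, x * x + rng.choice(NOISE)))

f_o = min(F, key=lambda h: empirical_risk(h, sample))  # empirical risk minimizer
f_star = min(F, key=true_risk)                         # best-in-class hypothesis

estimation_error = true_risk(f_o) - true_risk(f_star)
uniform_gap = max(abs(true_risk(h) - empirical_risk(h, sample)) for h in F)
print(estimation_error, 2 * uniform_gap)
```

The inequality holds for any sample, so changing the seed or the sample size changes the two numbers but not their order.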

Bias-Variance Decomposition

Decomposition of the Generalization Error

Assume the samples follow a distribution $\mathcal{D}$ with probability density $p_r(x,y)$. This can be written as $(x,y) \sim \mathcal{D}$ or as $(x,y) \sim p_r(x,y)$.

With the squared loss as the loss function, the generalization error of $f(x)$ is

$$
R(f) = \underset{(x,y) \sim p_r(x,y)}{\mathbb{E}} \left[ (y - f(x))^{2} \right]
$$

The optimal model $f^{*}(x)$ that minimizes $R(f)$ is then

$$
f^{*}(x) = \underset{f \in \mathcal{F}}{\arg\min}\, R(f) = \underset{y \sim p_r(y \mid x)}{\mathbb{E}} \left[ y \right]
$$

In this section $\mathcal{F}$ is taken to contain all functions of $x$, so $f^{*}$ is the overall optimal predictor and $R(f^{*}) = R^{*}$.

We want to measure the expected error between the model $f(x)$ and the true label $y$:

$$
\begin{aligned}
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f(x) - y)^{2} \right] &= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*} + f^{*} - y)^{2} \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] + \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f^{*}(x) - y)^{2} \right] + 2 \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})(f^{*}(x) - y) \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] + \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f^{*}(x) - y)^{2} \right]
\end{aligned}
$$

Going from the second line to the third requires showing that:

$$
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})(f^{*}(x) - y) \right] = 0
$$

First, we can show that:

$$
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*}(x) - y \right] = 0
$$

The proof is as follows:

$$
\begin{aligned}
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*}(x) \right] &= \underset{x \sim p_r(x)}{\mathbb{E}} \left[ f^{*}(x) \right] \\
&= \int_x f^{*}(x)\, p_r(x)\, \mathrm{d}x \\
&= \int_x \int_y p_r(y \mid x) \cdot y \cdot p_r(x)\, \mathrm{d}y\, \mathrm{d}x \\
&= \int_x \int_y p_r(x,y) \cdot y\, \mathrm{d}y\, \mathrm{d}x \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ y \right]
\end{aligned}
$$

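Next, $f(x)$ and $f^{*}(x) - y$ are uncorrelated: $f(x)$ depends only on $x$, and $\mathbb{E}[f^{*}(x) - y \mid x] = 0$, so conditioning on $x$ gives $\mathbb{E}[f(x)(f^{*}(x) - y)] = 0 = \mathbb{E}[f] \cdot \mathbb{E}[f^{*}(x) - y]$. (They need not be independent in general.) Therefore: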

$$
\begin{aligned}
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})(f^{*}(x) - y) \right] &= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f \left( f^{*}(x) - y \right) \right] - \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*} \left( f^{*}(x) - y \right) \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f \right] \cdot \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*}(x) - y \right] - \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*} \left( f^{*}(x) - y \right) \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*} \left( y - f^{*}(x) \right) \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*}(x) \cdot y \right] - \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ \left( f^{*}(x) \right)^{2} \right]
\end{aligned}
$$

It then remains to show that these two terms are equal:

$$
\begin{aligned}
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ \left( f^{*}(x) \right)^{2} \right] &= \underset{x \sim p_r(x)}{\mathbb{E}} \left[ \left( f^{*}(x) \right)^{2} \right] \\
&= \int_x \left( f^{*}(x) \right)^{2} p_r(x)\, \mathrm{d}x \\
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*}(x) \cdot y \right] &= \int_x \int_y f^{*}(x) \cdot p_r(x,y) \cdot y \,\mathrm{d}y\, \mathrm{d}x \\
&= \int_x f^{*}(x) \cdot p_r(x) \left( \int_y y \cdot p_r(y \mid x)\, \mathrm{d}y \right) \mathrm{d}x \\
&= \int_x f^{*}(x) \cdot p_r(x) \cdot f^{*}(x)\, \mathrm{d}x \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ \left( f^{*}(x) \right)^{2} \right]
\end{aligned}
$$
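The identities used in this proof can be verified by exact enumeration. The following is a small sketch on a hypothetical discrete joint distribution `P`; the distribution and the competing model `f` are assumptions made only for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution p_r(x, y); the probabilities sum to 1.
P = {
    (0, 0): Fraction(1, 4), (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (1, 3): Fraction(1, 8), (1, 8): Fraction(1, 4),
}

def expect(g):
    """E[g(x, y)] under P."""
    return sum(p * g(x, y) for (x, y), p in P.items())

def f_star(x):
    """Conditional mean E[y | x] computed from P."""
    px = sum(p for (xx, _), p in P.items() if xx == x)
    return sum(p * y for (xx, y), p in P.items() if xx == x) / px

def f(x):
    """An arbitrary competing model (an assumption for illustration)."""
    return 3 * x + 2

mean_gap = expect(lambda x, y: f_star(x) - y)                    # E[f*(x) - y]
cross = expect(lambda x, y: (f(x) - f_star(x)) * (f_star(x) - y))  # cross term
lhs = expect(lambda x, y: f_star(x) * y)                         # E[f*(x) y]
rhs = expect(lambda x, y: f_star(x) ** 2)                        # E[(f*(x))^2]
print(mean_gap, cross, lhs, rhs)
```

In this example the noise spread differs between $x = 0$ and $x = 1$, yet the cross term is still exactly zero.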


Returning to the original formula:

$$
\begin{aligned}
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f(x) - y)^{2} \right] &= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] + \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f^{*}(x) - y)^{2} \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] + R^{*}
\end{aligned}
$$

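The first term is the error between the current model and the optimal model. The second term, $R^{*}$, is the error between the optimal model and the true data, that is, the Bayes error; once the distribution $\mathcal{D}$ is fixed, it is a constant.

In other words, the total expected loss $R(f)$ of $f(x)$ is minimal only when $f = f^{*}$.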
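As a sanity check of the decomposition $R(f) = \mathbb{E}[(f - f^{*})^{2}] + R^{*}$, the sketch below evaluates both sides exactly for a few candidate models. The discrete distribution `P` and the candidates are hypothetical choices for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution p_r(x, y); the probabilities sum to 1.
P = {
    (0, 0): Fraction(1, 4), (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (1, 3): Fraction(1, 8), (1, 8): Fraction(1, 4),
}

def expect(g):
    """E[g(x, y)] under P."""
    return sum(p * g(x, y) for (x, y), p in P.items())

def f_star(x):
    """Conditional mean E[y | x] computed from P."""
    px = sum(p for (xx, _), p in P.items() if xx == x)
    return sum(p * y for (xx, y), p in P.items() if xx == x) / px

def risk(f):
    """Squared-loss generalization error R(f)."""
    return expect(lambda x, y: (y - f(x)) ** 2)

bayes_risk = risk(f_star)  # R* = R(f*)

candidates = {
    "constant": lambda x: 2,
    "linear": lambda x: 3 * x + 2,
    "optimal": f_star,
}
for name, f in candidates.items():
    distance = expect(lambda x, y, f=f: (f(x) - f_star(x)) ** 2)
    print(name, risk(f), distance + bayes_risk)
```

Each printed pair agrees, and no candidate has a risk below `bayes_risk`.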


Where the Optimal Function Comes From

Looking back at the text above, a question arises:

$$
f^{*}(x) = \underset{y \sim p_r(y \mid x)}{\mathbb{E}} \left[ y \right]
$$

Where does this function come from?

We know that $f^{*}$ is the function that minimizes $R(f) = \underset{(x,y) \sim p_r(x,y)}{\mathbb{E}} \left[ (y - f(x))^{2} \right]$, but why does it take this form?

Why can it not take the following form instead?

$$
\hat{f}(x) = \underset{y}{\arg\max}\, p_r(y \mid x)
$$
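A quick numerical comparison helps motivate the question. The sketch below, on a hypothetical skewed discrete distribution, computes the squared-loss risk of the conditional mean and of the conditional mode ($\arg\max_y p_r(y \mid x)$). The distribution is an assumption chosen so that the two predictors differ.

```python
from fractions import Fraction

# Hypothetical joint distribution p_r(x, y) with skewed conditionals.
P = {
    (0, 0): Fraction(3, 8), (0, 4): Fraction(1, 8),
    (1, 1): Fraction(1, 8), (1, 3): Fraction(1, 8), (1, 8): Fraction(1, 4),
}

def conditional(x):
    """Unnormalized conditional weights {y: p_r(x, y)} for a fixed x."""
    return {y: p for (xx, y), p in P.items() if xx == x}

def cond_mean(x):
    c = conditional(x)
    return sum(p * y for y, p in c.items()) / sum(c.values())

def cond_mode(x):
    c = conditional(x)
    return max(c, key=c.get)  # the most probable y given x

def risk(f):
    """Squared-loss risk E[(y - f(x))^2] under P."""
    return sum(p * (y - f(x)) ** 2 for (x, y), p in P.items())

print(risk(cond_mean), risk(cond_mode))
```

Under the squared loss the mode does worse here. The mode is the natural choice for the 0-1 loss instead, which is a different objective from the one used in this section.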

So next we derive the form of $f^{*}$ in detail.

Suppose $f(x)$ is the function that minimizes $R(f)$. For the perturbed function

$$
\tilde{f} = f + \varepsilon \cdot \eta(x)
$$

where $\eta(x)$ is an arbitrary function and $\varepsilon$ is a real number, we define a function $\phi(\varepsilon)$:

$$
\begin{aligned}
\phi(\varepsilon) &= R(\tilde{f}; \varepsilon) \\
&= \underset{(x,y) \sim p_r(x,y)}{\mathbb{E}} \left[ (y - f(x) - \varepsilon \cdot \eta(x))^{2} \right] \\
&= R(f) + \varepsilon^{2} \underset{x \sim p_r(x)}{\mathbb{E}} \left[ \eta^{2}(x) \right] + 2 \varepsilon \underset{(x,y) \sim p_r(x,y)}{\mathbb{E}} \left[ (f(x) - y)\, \eta(x) \right]
\end{aligned}
$$

Since $f(x)$ is the function that minimizes $R(f)$, $\phi(\varepsilon)$ must attain its minimum at $\varepsilon = 0$, so

$$
\left. \frac{\partial \phi(\varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} = 0
$$

Hence:

$$
\begin{aligned}
\left. \frac{\partial \phi(\varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} &= 2 \underset{(x,y) \sim p_r(x,y)}{\mathbb{E}} \left[ (f(x) - y)\, \eta(x) \right] \\
&= 2 \int_x \int_y (f(x) - y) \cdot \eta(x) \cdot p_r(x,y)\, \mathrm{d}y\, \mathrm{d}x \\
&= 2 \int_x \eta(x) \int_y (f(x) - y) \cdot p_r(x,y)\, \mathrm{d}y\, \mathrm{d}x \\
&= 0
\end{aligned}
$$

Because $\eta(x)$ is arbitrary, the following must hold for every $x$:

$$
\int_y (f(x) - y) \cdot p_r(x,y)\, \mathrm{d}y = 0
$$

We also have:

$$
\begin{aligned}
\int_y f(x) \cdot p_r(x,y)\, \mathrm{d}y &= f(x) \int_y p_r(x,y)\, \mathrm{d}y \\
&= f(x) \cdot p_r(x)
\end{aligned}
$$

$$
\int_y y \cdot p_r(x,y)\, \mathrm{d}y = p_r(x) \int_y y \cdot p_r(y \mid x)\, \mathrm{d}y
$$

Dividing by $p_r(x)$ (wherever $p_r(x) > 0$), we obtain:

$$
f(x) = \int_y y \cdot p_r(y \mid x)\, \mathrm{d}y = \underset{y \sim p_r(y \mid x)}{\mathbb{E}} \left[ y \right]
$$
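The variational argument can be illustrated numerically. In the sketch below, the distribution `P`, the perturbation direction `eta`, and the non-optimal model `shifted` are all hypothetical. It evaluates the slope $2\,\mathbb{E}[(f(x) - y)\,\eta(x)]$ of $\phi$ at $\varepsilon = 0$ and checks the quadratic form of $\phi$ around the conditional mean.

```python
from fractions import Fraction

# Hypothetical joint distribution p_r(x, y); the probabilities sum to 1.
P = {
    (0, 0): Fraction(1, 4), (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (1, 3): Fraction(1, 8), (1, 8): Fraction(1, 4),
}

def expect(g):
    """E[g(x, y)] under P."""
    return sum(p * g(x, y) for (x, y), p in P.items())

def f_star(x):
    """Conditional mean E[y | x] computed from P."""
    px = sum(p for (xx, _), p in P.items() if xx == x)
    return sum(p * y for (xx, y), p in P.items() if xx == x) / px

def eta(x):
    """An arbitrary perturbation direction (an assumption)."""
    return x + 1

def shifted(x):
    """A non-optimal model: the conditional mean plus a constant offset."""
    return f_star(x) + 1

def phi(eps, f):
    """phi(eps) = R(f + eps * eta)."""
    return expect(lambda x, y: (y - f(x) - eps * eta(x)) ** 2)

def slope_at_zero(f):
    """Derivative of phi at eps = 0: 2 E[(f(x) - y) eta(x)]."""
    return 2 * expect(lambda x, y: (f(x) - y) * eta(x))

eps = Fraction(1, 10)
eta_sq = expect(lambda x, y: eta(x) ** 2)
print(slope_at_zero(f_star), slope_at_zero(shifted))
print(phi(eps, f_star) - phi(0, f_star) == eps ** 2 * eta_sq)
```

The slope vanishes at the conditional mean but not at the shifted model, so only the former can be a minimizer along this direction.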


Another Proof

In addition, I came up with another way to prove the decomposition of the generalization error:

$$
\begin{aligned}
\underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f(x) - y)^{2} \right] &= \int_x \int_y p_r(x,y) \cdot y^{2}\, \mathrm{d}x\, \mathrm{d}y + \int_x \int_y p_r(x,y) \cdot f^{2}(x)\, \mathrm{d}x\, \mathrm{d}y - 2 \int_x \int_y p_r(x,y) \cdot y \cdot f(x)\, \mathrm{d}x\, \mathrm{d}y \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ y^{2} \right] + \int_x p_r(x) \cdot f^{2}(x)\, \mathrm{d}x - 2 \int_x f(x) \cdot p_r(x) \int_y p_r(y \mid x) \cdot y\, \mathrm{d}y\, \mathrm{d}x \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ y^{2} \right] + \int_x p_r(x) \cdot f^{2}(x)\, \mathrm{d}x - 2 \int_x f(x) \cdot p_r(x) \cdot f^{*}(x)\, \mathrm{d}x \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ y^{2} \right] + \int_x p_r(x) \cdot \left( f(x) - f^{*}(x) \right)^{2} \mathrm{d}x - \int_x p_r(x) \cdot \left( f^{*}(x) \right)^{2} \mathrm{d}x \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ y^{2} \right] - \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ \left( f^{*} \right)^{2} \right] + \underset{x \sim p_r(x)}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ y^{2} \right] + \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ \left( f^{*} \right)^{2} \right] - 2 \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ f^{*}(x) \cdot y \right] + \underset{x \sim p_r(x)}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] \\
&= \underset{(x,y) \sim \mathcal{D}}{\mathbb{E}} \left[ (f^{*} - y)^{2} \right] + \underset{x \sim p_r(x)}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] \\
&= \underset{x \sim p_r(x)}{\mathbb{E}} \left[ (f - f^{*})^{2} \right] + R^{*}
\end{aligned}
$$

As we can see, although this proof starts from a different angle, it arrives at the same result.
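The intermediate quantities of this second route can also be checked exactly. The sketch below uses a hypothetical discrete distribution `P` and an arbitrary model `f` (both assumptions) and verifies that $\mathbb{E}[y^{2}] - \mathbb{E}[(f^{*})^{2}]$ equals $R^{*}$, which is the step that relies on $\mathbb{E}[f^{*}(x) \cdot y] = \mathbb{E}[(f^{*}(x))^{2}]$.

```python
from fractions import Fraction

# Hypothetical joint distribution p_r(x, y); the probabilities sum to 1.
P = {
    (0, 0): Fraction(1, 4), (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (1, 3): Fraction(1, 8), (1, 8): Fraction(1, 4),
}

def expect(g):
    """E[g(x, y)] under P."""
    return sum(p * g(x, y) for (x, y), p in P.items())

def f_star(x):
    """Conditional mean E[y | x] computed from P."""
    px = sum(p for (xx, _), p in P.items() if xx == x)
    return sum(p * y for (xx, y), p in P.items() if xx == x) / px

def f(x):
    """An arbitrary model (an assumption for illustration)."""
    return 3 * x + 2

second_moment_y = expect(lambda x, y: y * y)               # E[y^2]
second_moment_fstar = expect(lambda x, y: f_star(x) ** 2)  # E[(f*)^2]
distance = expect(lambda x, y: (f(x) - f_star(x)) ** 2)    # E[(f - f*)^2]
bayes_risk = expect(lambda x, y: (f_star(x) - y) ** 2)     # R*
total_risk = expect(lambda x, y: (f(x) - y) ** 2)          # R(f)

print(second_moment_y - second_moment_fstar, bayes_risk)
print(total_risk, second_moment_y - second_moment_fstar + distance)
```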

Error Decomposition of a Trained Model

In practice, a model $f(x)$ is always trained on some training set $D$; we denote it by $f_D(x)$.

We also define the expected model over different training sets, $\bar{f}(x) = \underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ f_D(x) \right]$.

For a single sample $x$, the expected error between $f_D(x)$ and $f^{*}(x)$ over different training sets $D$ is:

$$
\begin{aligned}
\underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ \left( f_D(x) - f^{*}(x) \right)^{2} \right] &= \underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ \left( f_D - \bar{f} \right)^{2} \right] + \underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ \left( \bar{f} - f^{*} \right)^{2} \right] + 2 \underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ \left( f_D - \bar{f} \right) \left( \bar{f} - f^{*} \right) \right] \\
&= \underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ \left( f_D(x) - \bar{f}(x) \right)^{2} \right] + \underset{D \sim \mathcal{D}^{m}}{\mathbb{E}} \left[ \left( \bar{f}(x) - f^{*}(x) \right)^{2} \right]
\end{aligned}
$$

Going from the first line to the second requires showing that the cross term vanishes:

$$
\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D-\bar{f} \right) \left( \bar{f}-f^* \right) \right] =0
$$

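The proof is straightforward: expand the product, then use the fact that $\bar{f}(x)=\mathbb{E}_{D\sim \mathcal{D}^m}\left[ f_D(x) \right]$ while $\bar{f}(x)$ and $f^*(x)$ do not depend on $D$: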

$$
\begin{aligned}
\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D-\bar{f} \right) \left( \bar{f}-f^* \right) \right] &=\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D\bar{f}-f_Df^*-\bar{f}^2+\bar{f}f^* \right] \\
&=\bar{f}\left( x \right) \underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D\left( x \right) \right] -f^*\left( x \right) \underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D\left( x \right) \right] -\bar{f}^2\left( x \right) +f^*\left( x \right) \bar{f}\left( x \right) \\
&=\bar{f}\left( x \right) \bar{f}\left( x \right) -f^*\left( x \right) \bar{f}\left( x \right) -\bar{f}^2\left( x \right) +f^*\left( x \right) \bar{f}\left( x \right) \\
&=0
\end{aligned}
$$

Returning to the original expression:

$$
\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D\left( x \right) -f^*\left( x \right) \right) ^2 \right] =\underbrace{\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D\left( x \right) -\bar{f}\left( x \right) \right) ^2 \right] }_{\mathrm{variance}.\mathrm{x}}+\underbrace{\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( \bar{f}\left( x \right) -f^*\left( x \right) \right) ^2 \right] }_{\left( \mathrm{bias}.\mathrm{x} \right) ^2}
$$

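The first term is the variance. It measures how much the learned model fluctuates across different training sets; a large variance suggests that the model may be overfitting.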

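The second term is the squared bias. It measures the gap between the average model produced by the learning algorithm and the optimal model.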
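The decomposition can be checked numerically. The following is a minimal sketch, not part of the original text: the target `f_star`, the least-squares line learner `fit_line`, the noise level, and the sample sizes are all assumptions chosen for illustration. The expectation over training sets is replaced by an average over simulated datasets, and $\bar{f}(x)$ by the average prediction, so the cross term cancels up to rounding error.

```python
import math
import random


def f_star(x):
    """Hypothetical target function (an assumption for this sketch)."""
    return math.sin(x)


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns the fitted predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b


def decompose(x0, m=20, n_datasets=2000, noise=0.3, seed=0):
    """Estimate expected squared error, variance and squared bias at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_datasets):
        # Draw one training set D of size m and record f_D(x0).
        xs = [rng.uniform(0.0, math.pi) for _ in range(m)]
        ys = [f_star(x) + rng.gauss(0.0, noise) for x in xs]
        preds.append(fit_line(xs, ys)(x0))
    f_bar = sum(preds) / len(preds)  # empirical stand-in for the average model
    mse = sum((p - f_star(x0)) ** 2 for p in preds) / len(preds)
    variance = sum((p - f_bar) ** 2 for p in preds) / len(preds)
    bias_sq = (f_bar - f_star(x0)) ** 2
    return mse, variance, bias_sq


mse, variance, bias_sq = decompose(x0=math.pi / 2)
print(f"E[(f_D - f*)^2] = {mse:.6f}")
print(f"variance + bias^2 = {variance + bias_sq:.6f}")
```

The two printed values should agree to floating-point precision, because the average prediction plays the role of $\bar{f}(x)$ exactly within the simulated sample.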

Error decomposition - Figure 78

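For a dataset of fixed size, there is a trade-off between variance and bias:

  • The more complex the model, the stronger its fitting capacity, so bias decreases but variance increases.
  • The simpler the model, the lower the variance, but the larger the bias.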
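The trade-off can be illustrated with a small simulation. This is a sketch under stated assumptions rather than anything taken from the original text: the target `f_star`, the piecewise-constant learner `fit_histogram`, and every parameter value are hypothetical. The number of bins acts as the complexity knob, and bias and variance are estimated at a single point `x0`.

```python
import math
import random


def f_star(x):
    """Hypothetical target function on [0, 1] (an assumption for this sketch)."""
    return math.sin(2 * math.pi * x)


def fit_histogram(xs, ys, n_bins):
    """Piecewise-constant regressor: predict the mean label of the bin containing x.

    n_bins controls model complexity; empty bins fall back to the global mean.
    """
    global_mean = sum(ys) / len(ys)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    means = [s / c if c else global_mean for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def bias_variance(n_bins, x0=0.25, m=100, n_datasets=2000, noise=0.3, seed=0):
    """Estimate squared bias and variance of the prediction at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_datasets):
        xs = [rng.random() for _ in range(m)]
        ys = [f_star(x) + rng.gauss(0.0, noise) for x in xs]
        preds.append(fit_histogram(xs, ys, n_bins)(x0))
    f_bar = sum(preds) / len(preds)
    variance = sum((p - f_bar) ** 2 for p in preds) / len(preds)
    bias_sq = (f_bar - f_star(x0)) ** 2
    return bias_sq, variance


# Same seed for every setting, so all models see the same simulated datasets.
results = {b: bias_variance(b) for b in (1, 4, 20)}
for n_bins, (bias_sq, variance) in results.items():
    print(f"bins={n_bins:2d}  bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

In this setup a single bin (a constant model) should show a large squared bias and a small variance, while 20 bins should show the opposite pattern.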

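For example, when a regularization term is added to the model, the larger its weight (conventionally written $\lambda$), the simpler the learned model: variance decreases and overfitting is curbed, but the regularization term itself increases the bias.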
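A small experiment can make the effect of the regularization weight concrete. The sketch below assumes a one-parameter ridge regression without intercept, a hypothetical ground truth $f^*(x)=2x$, and arbitrary noise and sample sizes; none of these come from the original text. The name `lam` stands for the regularization weight.

```python
import random

TRUE_SLOPE = 2.0  # hypothetical ground truth f*(x) = 2x (an assumption)


def fit_ridge_1d(xs, ys, lam):
    """One-parameter ridge regression without intercept.

    Minimizes sum((y - w*x)^2) + lam * w^2, whose closed form is
    w = sum(x*y) / (sum(x^2) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def bias_variance(lam, x0=1.0, m=20, n_datasets=3000, noise=0.5, seed=0):
    """Estimate squared bias and variance of the prediction at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_datasets):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        ys = [TRUE_SLOPE * x + rng.gauss(0.0, noise) for x in xs]
        preds.append(fit_ridge_1d(xs, ys, lam) * x0)
    f_bar = sum(preds) / len(preds)
    variance = sum((p - f_bar) ** 2 for p in preds) / len(preds)
    bias_sq = (f_bar - TRUE_SLOPE * x0) ** 2
    return bias_sq, variance


# Same seed for every lam, so all settings are compared on the same datasets.
results = {lam: bias_variance(lam) for lam in (0.0, 1.0, 10.0)}
for lam, (bias_sq, variance) in results.items():
    print(f"lam={lam:5.1f}  bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

With these settings, increasing `lam` shrinks the fitted slope toward zero, so the estimated variance should fall while the squared bias rises.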

We therefore need to strike a good balance between bias and variance so that the overall error is minimized.

Error decomposition - Figure 80