Estimation Error and Approximation Error
In general, for any $f\in\mathcal{F}$,

$$
R\left( f \right) -R^{*}=\left[ R\left( f \right) -R\left( f^{*} \right) \right] +\left[ R\left( f^{*} \right) -R^{*} \right]
$$

where $f^{*}$ is the model with the smallest error in $\mathcal{F}$, also called the best-in-class hypothesis.
We call $R(f)-R(f^{*})$ the estimation error and $R(f^{*})-R^{*}$ the approximation error.
The larger $\mathcal{F}$ is, the smaller the approximation error becomes, but the larger the estimation error becomes.
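As a minimal sketch of this split, the snippet below computes both errors exactly on a small discrete distribution. The toy distribution, the parameter grid, the squared loss, and the helper names (`risk`, `decompose`) are all assumptions made for illustration, not part of the original notes.

```python
from itertools import product

# Hypothetical toy distribution: x uniform on {0, 1, 2}, y = x^2 + e with e = -1 or +1.
JOINT = {(x, x * x + e): 1 / 6 for x in (0, 1, 2) for e in (-1, 1)}

def risk(f):
    """Exact squared-loss risk R(f) = E[(y - f(x))^2] under JOINT."""
    return sum(p * (y - f(x)) ** 2 for (x, y), p in JOINT.items())

# Bayes risk R*: under squared loss the best predictor is E[y | x] = x^2 here.
R_STAR = risk(lambda x: x * x)

def decompose(f, hypothesis_class):
    """Return (estimation error, approximation error) of f relative to the class."""
    best_in_class = min(risk(g) for g in hypothesis_class)
    return risk(f) - best_in_class, best_in_class - R_STAR

GRID = [k / 4 for k in range(-8, 21)]  # parameter values -2.0, -1.75, ..., 5.0
constants = [lambda x, b=b: b for b in GRID]                             # small class
lines = [lambda x, a=a, b=b: a * x + b for a, b in product(GRID, GRID)]  # larger class

f = constants[GRID.index(0.0)]  # a fixed model (f(x) = 0) that lies in both classes
for name, cls in (("constants", constants), ("lines", lines)):
    est, approx = decompose(f, cls)
    print(f"{name}: estimation={est:.3f} approximation={approx:.3f}")
```

Because `lines` contains every constant on the grid, its best-in-class risk can only be lower, so the approximation error shrinks. The two terms always add up to the excess risk $R(f)-R^{*}$ of the fixed model.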

Let $R(f)$ denote the generalization error of $f$ and $\widehat{R}(f)$ its empirical error.
Let $f_o=\arg \min_f \widehat{R}(f)$ be the model that minimizes the empirical error,
and let $f^{*}=\arg \min_f R(f)$ be the model that minimizes the generalization error.
Then:

$$
\begin{aligned}
	R\left( f_o \right) -R\left( f^{*} \right) &=R\left( f_o \right) -\widehat{R}\left( f_o \right) +\widehat{R}\left( f_o \right) -R\left( f^{*} \right)\\
	&\le R\left( f_o \right) -\widehat{R}\left( f_o \right) +\widehat{R}\left( f^{*} \right) -R\left( f^{*} \right)\\
	&\le 2\sup_{f\in \mathcal{F}}\left| R(f)-\widehat{R}(f) \right|
\end{aligned}
$$

The second line uses $\widehat{R}(f_o)\le \widehat{R}(f^{*})$, which holds because $f_o$ minimizes the empirical error.
The quantity $\sup_{f\in \mathcal{F}}\left| R(f)-\widehat{R}(f) \right|$ is an upper bound on the gap between the two errors.
As $\mathcal{F}$ grows, this bound grows as well, which means the upper bound on the estimation error increases.
Meanwhile, $R(f^{*})$ decreases as $\mathcal{F}$ grows, so the approximation error decreases.
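The chain of inequalities holds for every sample, which can be checked numerically. The sketch below is only an illustration under assumed choices: a six-point toy distribution, a finite class of lines through the origin, squared loss, and a fixed random seed. None of these come from the original notes.

```python
import random

# Hypothetical toy distribution: (x, y) uniform over these six points.
SUPPORT = [(x, x * x + e) for x in (0, 1, 2) for e in (-1, 1)]

def true_risk(f):
    """Generalization error R(f): exact expectation of the squared loss."""
    return sum((y - f(x)) ** 2 for x, y in SUPPORT) / len(SUPPORT)

def empirical_risk(f, sample):
    """Empirical error: average squared loss over the sample."""
    return sum((y - f(x)) ** 2 for x, y in sample) / len(sample)

def gap_and_bound(hypotheses, sample):
    """Return (R(f_o) - R(f*), 2 * sup_f |R(f) - R_hat(f)|) for a finite class."""
    f_o = min(hypotheses, key=lambda f: empirical_risk(f, sample))  # empirical minimizer
    f_star = min(hypotheses, key=true_risk)                         # best in class
    gap = true_risk(f_o) - true_risk(f_star)
    sup_dev = max(abs(true_risk(f) - empirical_risk(f, sample)) for f in hypotheses)
    return gap, 2 * sup_dev

rng = random.Random(0)
hypotheses = [lambda x, a=a: a * x for a in (k / 2 for k in range(9))]
sample = [rng.choice(SUPPORT) for _ in range(20)]
gap, bound = gap_and_bound(hypotheses, sample)
print(f"R(f_o) - R(f*) = {gap:.4f}, 2 * sup deviation = {bound:.4f}")
```

The class is kept finite so that the supremum is a plain maximum; for an infinite class the same quantity would have to be bounded analytically.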
Bias-Variance Decomposition
Decomposition of the Generalization Error
Assume the samples follow the distribution $\mathcal{D}$ with probability distribution function $p_r(x,y)$. This can be written as $(x,y)\sim \mathcal{D}$ or as $(x,y)\sim p_r(x,y)$.
With the squared loss as the loss function, the generalization error of $f(x)$ is

$$
R(f)=\underset{(x, y) \sim p_{r}(x, y)}{\mathbb{E}}\left[(y-f(x))^{2}\right]
$$

The optimal model $f^{*}(x)$ that minimizes $R(f)$ is then (assuming $\mathcal{F}$ is rich enough to contain the conditional mean)

$$
f^{*}\left( x \right) =\underset{f\in \mathcal{F}}{\mathrm{arg}\min}\,R(f)=\underset{y\sim p_r(y\mid x)}{\mathbb{E}}\left[ y \right]
$$
We need to measure the expected error between the model $f(x)$ and the true label $y$:

$$
\begin{aligned}
	\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f\left( x \right) -y \right) ^2 \right] &=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*}+f^{*}-y \right) ^2 \right]\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right] +\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) -y \right) ^2 \right] +2\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) \left( f^{*}\left( x \right) -y \right) \right]\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right] +\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) -y \right) ^2 \right]
\end{aligned}
$$
Going from the second line to the third requires showing that the cross term vanishes:

$$
\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) \left( f^{*}\left( x \right) -y \right) \right]=0
$$

First, we can show that

$$
\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( x \right) -y \right]=0
$$

The proof is as follows:

$$
\begin{aligned}
\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( x \right) \right] &=\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ f^{*}\left( x \right) \right]\\
&=\int_x{f^{*}\left( x \right) p_r\left( x \right) \mathrm{d}x}\\
&=\int_x{\left( \int_y{p_r\left( y\mid x \right) \cdot y\,\mathrm{d}y} \right) \cdot p_r\left( x \right) \mathrm{d}x}\\
&=\int_x{\int_y{p_r\left( x,y \right) \cdot y\,\mathrm{d}y}\,\mathrm{d}x}\\
&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ y \right]
\end{aligned}
$$
Next, $f(x)$ depends only on $x$, and $\underset{y\sim p_r(y\mid x)}{\mathbb{E}}\left[ f^{*}(x)-y \right]=0$ for every $x$ by the definition of $f^{*}$, so conditioning on $x$ removes the first term:

$$
\begin{aligned}
\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) \left( f^{*}\left( x \right) -y \right) \right] &=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f\left( f^{*}\left( x \right) -y \right) \right] -\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( f^{*}\left( x \right) -y \right) \right]\\
&=\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ f\left( x \right) \cdot \underset{y\sim p_r(y\mid x)}{\mathbb{E}}\left[ f^{*}\left( x \right) -y \right] \right] -\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( f^{*}\left( x \right) -y \right) \right]\\
&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( y-f^{*}\left( x \right) \right) \right]\\
&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( x \right) \cdot y \right] -\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) \right) ^2 \right]
\end{aligned}
$$
It then only remains to show that the two terms are equal:

$$
\begin{aligned}
	\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) \right) ^2 \right] &=\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) \right) ^2 \right]\\
	&=\int_x{\left( f^{*}\left( x \right) \right) ^2p_r\left( x \right) \mathrm{d}x}\\
	\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( x \right) \cdot y \right] &=\int_x{\int_y{f^{*}\left( x \right) \cdot p_r\left( x,y \right) \cdot y}\,\mathrm{d}y\,\mathrm{d}x}\\
	&=\int_x{f^{*}\left( x \right) \cdot p_r\left( x \right) \left( \int_y{y\cdot p_r\left( y\mid x \right) \mathrm{d}y} \right) \mathrm{d}x}\\
	&=\int_x{f^{*}\left( x \right) \cdot p_r\left( x \right) \cdot f^{*}\left( x \right) \mathrm{d}x}\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) \right) ^2 \right]
\end{aligned}
$$
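Both identities used in this argument, $\mathbb{E}[f^{*}(x)]=\mathbb{E}[y]$ and $\mathbb{E}[f^{*}(x)\cdot y]=\mathbb{E}[(f^{*}(x))^2]$, can be sanity-checked on a discrete distribution. The following is a minimal sketch; the joint pmf and the helper names `conditional_mean` and `expect` are hypothetical and chosen only for illustration.

```python
# Hypothetical joint pmf p_r(x, y) on four points.
JOINT = {(0, 0.0): 0.2, (0, 2.0): 0.1, (1, 1.0): 0.3, (1, 4.0): 0.4}

def conditional_mean(joint):
    """Return {x: E[y | x]}, i.e. f*(x), from a joint pmf."""
    px, pxy = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p          # marginal p_r(x)
        pxy[x] = pxy.get(x, 0.0) + p * y    # integral of y * p_r(x, y) over y
    return {x: pxy[x] / px[x] for x in px}

def expect(joint, g):
    """E[g(x, y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

f_star = conditional_mean(JOINT)
mean_f_star = expect(JOINT, lambda x, y: f_star[x])         # E[f*(x)]
mean_y = expect(JOINT, lambda x, y: y)                      # E[y]
cross = expect(JOINT, lambda x, y: f_star[x] * y)           # E[f*(x) * y]
second_moment = expect(JOINT, lambda x, y: f_star[x] ** 2)  # E[f*(x)^2]
print(mean_f_star, mean_y)
print(cross, second_moment)
```

Each printed pair should agree up to floating-point rounding.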
Returning to the original formula:

$$
\begin{aligned}
\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f\left( x \right) -y \right) ^2 \right]
&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right] +\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}\left( x \right) -y \right) ^2 \right]\\
&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right] +R^{*}
\end{aligned}
$$

The first term is the error between the current model and the optimal model. The second term, $R^{*}$, is the error between the optimal model and the real data, i.e. the Bayes error; it is a constant once $\mathcal{D}$ is fixed.
In other words, the total expected loss $R(f)$ of $f(x)$ is minimized only when $f=f^{*}$.
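The decomposition can be verified exactly on a small discrete distribution. This is a sketch under stated assumptions: the joint pmf, the test model `3x - 1`, and the helper names `conditional_mean` and `split_risk` are hypothetical and serve only as an illustration.

```python
# Hypothetical joint pmf p_r(x, y) on four points.
JOINT = {(0, 0.0): 0.2, (0, 2.0): 0.1, (1, 1.0): 0.3, (1, 4.0): 0.4}

def conditional_mean(joint):
    """Return {x: E[y | x]}, i.e. the optimal model f*(x) under squared loss."""
    px, pxy = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        pxy[x] = pxy.get(x, 0.0) + p * y
    return {x: pxy[x] / px[x] for x in px}

def split_risk(joint, f):
    """Return (R(f), E[(f - f*)^2], R*), each computed exactly from the pmf."""
    f_star = conditional_mean(joint)
    total = sum(p * (f(x) - y) ** 2 for (x, y), p in joint.items())
    model_gap = sum(p * (f(x) - f_star[x]) ** 2 for (x, y), p in joint.items())
    bayes = sum(p * (f_star[x] - y) ** 2 for (x, y), p in joint.items())
    return total, model_gap, bayes

total, model_gap, bayes = split_risk(JOINT, lambda x: 3.0 * x - 1.0)
print(total, model_gap + bayes)  # the two numbers should agree up to rounding
```

Passing the conditional mean itself as `f` makes the first term zero, leaving only the Bayes error.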
Where the Optimal Function Comes From
Looking back at the text above, a question arises:

$$
f^{*}\left( x \right) =\underset{y\sim p_r(y\mid x)}{\mathbb{E}}\left[ y \right]
$$

Where does this function come from?
We know that $f^{*}$ is the function that minimizes $R(f)=\underset{(x, y) \sim p_{r}(x, y)}{\mathbb{E}}\left[(y-f(x))^{2}\right]$, but why does it take this form?
Why can it not take the following form instead?

$$
\hat{f}\left( x \right) =\underset{y}{\mathrm{arg}\max}\,p_r\left( y\mid x \right)
$$

So next we derive the origin of $f^{*}$ in detail.
Suppose $f(x)$ is the function that minimizes $R(f)$. Then, for the perturbed function

$$
\tilde{f}=f+\varepsilon \cdot \eta \left( x \right)
$$

where $\eta(x)$ is an arbitrary function, we define a function $\phi(\varepsilon)$:

$$
\begin{aligned}
	\phi \left( \varepsilon \right) &=R(\tilde{f};\varepsilon )\\
	&=\underset{(x,y)\sim p_r(x,y)}{\mathbb{E}}\left[ (y-f(x)-\varepsilon \cdot \eta \left( x \right) )^2 \right]\\
	&=R\left( f \right) +\varepsilon ^2\underset{x\sim p_r(x)}{\mathbb{E}}\left[ \eta ^2\left( x \right) \right] +2\varepsilon \underset{(x,y)\sim p_r(x,y)}{\mathbb{E}}\left[ \left( f\left( x \right) -y \right) \eta \left( x \right) \right]
\end{aligned}
$$
Since $f(x)$ is the function that minimizes $R(f)$, we must have

$$
\left. \frac{\partial \phi \left( \varepsilon \right)}{\partial \varepsilon} \right|_{\varepsilon =0}=0
$$

Therefore:

$$
\begin{aligned}
	\left. \frac{\partial \phi \left( \varepsilon \right)}{\partial \varepsilon} \right|_{\varepsilon =0}&=2\underset{(x,y)\sim p_r(x,y)}{\mathbb{E}}\left[ \left( f\left( x \right) -y \right) \eta \left( x \right) \right]\\
	&=2\int_x{\int_y{\left( f\left( x \right) -y \right) \cdot \eta \left( x \right) \cdot p_r\left( x,y \right)}\,\mathrm{d}y\,\mathrm{d}x}\\
	&=2\int_x{\eta \left( x \right) \int_y{\left( f\left( x \right) -y \right) \cdot p_r\left( x,y \right) \mathrm{d}y}\,\mathrm{d}x}\\
	&=0
\end{aligned}
$$
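The stationarity condition can be illustrated numerically. The sketch below assumes a hypothetical four-point joint pmf and an arbitrary perturbation `eta`; it compares a central-difference estimate of $\phi'(0)$ with the closed form $2\,\mathbb{E}[(f(x)-y)\,\eta(x)]$, once for the conditional mean and once for a non-optimal model.

```python
# Hypothetical joint pmf p_r(x, y) on four points.
JOINT = {(0, 0.0): 0.2, (0, 2.0): 0.1, (1, 1.0): 0.3, (1, 4.0): 0.4}

def phi(joint, f, eta, eps):
    """phi(eps) = R(f + eps * eta) under squared loss."""
    return sum(p * (y - f(x) - eps * eta(x)) ** 2 for (x, y), p in joint.items())

def dphi_at_zero(joint, f, eta, h=1e-3):
    """Central difference for phi'(0); phi is quadratic in eps, so this is exact up to rounding."""
    return (phi(joint, f, eta, h) - phi(joint, f, eta, -h)) / (2 * h)

def closed_form(joint, f, eta):
    """2 * E[(f(x) - y) * eta(x)], the derivative at eps = 0."""
    return 2 * sum(p * (f(x) - y) * eta(x) for (x, y), p in joint.items())

COND_MEAN = {0: 2 / 3, 1: 19 / 7}  # E[y | x] for the pmf above, worked out by hand

def eta(x):
    """An arbitrary perturbation direction (assumption for this example)."""
    return 1.0 + 5.0 * x

optimal = lambda x: COND_MEAN[x]
zero_model = lambda x: 0.0
print(dphi_at_zero(JOINT, optimal, eta), closed_form(JOINT, optimal, eta))
print(dphi_at_zero(JOINT, zero_model, eta), closed_form(JOINT, zero_model, eta))
```

For the conditional mean the derivative is zero up to rounding regardless of `eta`, while for the zero model it is clearly non-zero, so that model can be improved by moving along `eta`.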
Because $\eta(x)$ is arbitrary, the following must hold identically:

$$
\int_y{\left( f\left( x \right) -y \right) \cdot p_r\left( x,y \right) \mathrm{d}y}=0
$$

We also have:

$$
\begin{aligned}
	\int_y{f\left( x \right) \cdot p_r\left( x,y \right) \mathrm{d}y}&=f\left( x \right) \int_y{p_r\left( x,y \right) \mathrm{d}y}\\
	&=f\left( x \right) \cdot p_r\left( x \right)
\end{aligned}
$$

$$
\int_y{y\cdot p_r\left( x,y \right) \mathrm{d}y}=p_r\left( x \right) \int_y{y\cdot p_r\left( y\mid x \right) \mathrm{d}y}
$$

It follows that:

$$
f\left( x \right) =\int_y{y\cdot p_r\left( y\mid x \right) \mathrm{d}y}=\underset{y\sim p_r(y\mid x)}{\mathbb{E}}\left[ y \right]
$$
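To make the contrast with the $\arg\max$ candidate concrete, here is a small sketch at a single fixed $x$. The skewed conditional distribution `COND` is a made-up example, and the grid search over candidate predictions is a simple stand-in for the analytical minimization.

```python
# Hypothetical conditional distribution p_r(y | x) at one fixed x, skewed on purpose.
COND = {0.0: 0.6, 10.0: 0.4}

def pointwise_risk(c, cond):
    """E[(y - c)^2 | x] when the model predicts the value c at this x."""
    return sum(p * (y - c) ** 2 for y, p in cond.items())

mean = sum(p * y for y, p in COND.items())  # conditional expectation E[y | x]
mode = max(COND, key=COND.get)              # arg max_y p_r(y | x)
candidates = [k / 2 for k in range(21)]     # predictions 0.0, 0.5, ..., 10.0
best = min(candidates, key=lambda c: pointwise_risk(c, COND))

print("mean:", mean, "risk:", pointwise_risk(mean, COND))
print("mode:", mode, "risk:", pointwise_risk(mode, COND))
print("grid minimizer:", best)
```

Under squared loss the grid minimizer coincides with the conditional mean, and the mode incurs a strictly larger risk on this distribution. The mode would be the natural choice under a different loss (for example 0-1 loss on discrete labels), which is why the form of $f^{*}$ depends on the loss.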
An Alternative Proof
I also came up with another way to prove the decomposition of the generalization error:

$$
\begin{aligned}
	\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f\left( x \right) -y \right) ^2 \right] &=\int_x{\int_y{p_r\left( x,y \right) \cdot y^2}\,\mathrm{d}x\,\mathrm{d}y}+\int_x{\int_y{p_r\left( x,y \right) \cdot f^2\left( x \right)}\,\mathrm{d}x\,\mathrm{d}y}-2\int_x{\int_y{p_r\left( x,y \right) \cdot y\cdot f\left( x \right)}\,\mathrm{d}x\,\mathrm{d}y}\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ y^2 \right] +\int_x{p_r\left( x \right) \cdot f^2\left( x \right) \mathrm{d}x}-2\int_x{f\left( x \right) \cdot p_r\left( x \right) \int_y{p_r\left( y\mid x \right) \cdot y}\,\mathrm{d}y\,\mathrm{d}x}\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ y^2 \right] +\int_x{p_r\left( x \right) \cdot f^2\left( x \right) \mathrm{d}x}-2\int_x{f\left( x \right) \cdot p_r\left( x \right) \cdot f^{*}\left( x \right) \mathrm{d}x}\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ y^2 \right] +\int_x{p_r\left( x \right) \cdot \left( f\left( x \right) -f^{*}\left( x \right) \right) ^2\mathrm{d}x}-\int_x{p_r\left( x \right) \cdot \left( f^{*}\left( x \right) \right) ^2\mathrm{d}x}\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ y^2 \right] -\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*} \right) ^2 \right] +\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right]\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ y^2 \right] +\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*} \right) ^2 \right] -2\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ f^{*}\left( x \right) \cdot y \right] +\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right]\\
	&=\underset{\left( x,y \right) \sim \mathcal{D}}{\mathbb{E}}\left[ \left( f^{*}-y \right) ^2 \right] +\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right]\\
	&=\underset{x\sim p_r\left( x \right)}{\mathbb{E}}\left[ \left( f-f^{*} \right) ^2 \right] +R^{*}
\end{aligned}
$$

The sixth line uses $\mathbb{E}\left[ f^{*}(x)\cdot y \right] =\mathbb{E}\left[ (f^{*}(x))^2 \right]$, which was established earlier.
As we can see, although the approach is different, it yields the same result.
Error Decomposition for a Trained Model
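In practice, a model $f(x)$ is always trained on some training set $D$; we denote it by $f_D(x)$.
We also define the expected model over different training sets, $\bar{f}(x)=\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D(x) \right]$.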
For a single sample $x$, the expected error between $f_D(x)$ and $f^{*}(x)$ over different training sets $D$ is:

$$
\begin{aligned}
	\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D\left( x \right) -f^{*}\left( x \right) \right) ^2 \right] &=\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D-\bar{f} \right) ^2 \right] +\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( \bar{f}-f^{*} \right) ^2 \right] +2\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D-\bar{f} \right) \left( \bar{f}-f^{*} \right) \right]\\
	&=\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D\left( x \right) -\bar{f}\left( x \right) \right) ^2 \right] +\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( \bar{f}\left( x \right) -f^{*}\left( x \right) \right) ^2 \right]
\end{aligned}
$$
Going from the first line to the second requires showing that the cross term vanishes:
$$
\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D-\bar{f} \right) \left( \bar{f}-f^* \right) \right] =0
$$
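The proof is a direct expansion. For a fixed $x$, neither $\bar{f}(x)$ nor $f^*(x)$ depends on $D$, so both can be pulled out of the expectation, and $\mathbb{E}_{D\sim \mathcal{D}^m}\left[ f_D(x) \right] = \bar{f}(x)$: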
$$
\begin{aligned}
\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D-\bar{f} \right) \left( \bar{f}-f^* \right) \right] &=\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D\bar{f}-f_Df^*-\bar{f}^2+\bar{f}f^* \right] \\
&=\bar{f}\left( x \right) \underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D\left( x \right) \right] -f^*\left( x \right) \underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ f_D\left( x \right) \right] -\bar{f}^2\left( x \right) +f^*\left( x \right) \bar{f}\left( x \right) \\
&=\bar{f}\left( x \right) \bar{f}\left( x \right) -f^*\left( x \right) \bar{f}\left( x \right) -\bar{f}^2\left( x \right) +f^*\left( x \right) \bar{f}\left( x \right) \\
&=0
\end{aligned}
$$
Returning to the original expression:
$$
\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D\left( x \right) -f^*\left( x \right) \right) ^2 \right] =\underbrace{\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( f_D\left( x \right) -\bar{f}\left( x \right) \right) ^2 \right] }_{\mathrm{variance}(x)}+\underbrace{\underset{D\sim \mathcal{D}^m}{\mathbb{E}}\left[ \left( \bar{f}\left( x \right) -f^*\left( x \right) \right) ^2 \right] }_{\left( \mathrm{bias}(x) \right) ^2}
$$
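The first term is the variance. It measures how much the learned model fluctuates across different training sets; a large variance suggests the model may be overfitting.
The second term is the squared bias. It measures the gap between the average model learned by the algorithm and the optimal model.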
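
As a numerical sanity check of the decomposition, the sketch below estimates each term by Monte Carlo at a single point. Everything specific in it is an assumption made for illustration and is not part of these notes: the target `true_fn`, the uniform inputs with Gaussian label noise, the deliberately too-simple learner `fit_constant`, and the evaluation point `x0`. Expectations over $D$ are replaced by averages over sampled training sets, and $\bar{f}(x)$ by the average prediction.

```python
import random


def true_fn(x):
    # Hypothetical target f*(x), chosen only for illustration.
    return 2.0 * x + 1.0


def sample_dataset(m, rng, noise=0.5):
    # Draw m training points: x uniform on [-1, 1], y = f*(x) + Gaussian noise.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    ys = [true_fn(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    # A very simple learner: always predict the mean training label.
    c = sum(ys) / len(ys)
    return lambda x: c


def decompose(x0, m=20, n_datasets=2000, seed=0):
    rng = random.Random(seed)
    preds = []
    for _ in range(n_datasets):
        xs, ys = sample_dataset(m, rng)
        preds.append(fit_constant(xs, ys)(x0))  # f_D(x0) for this D

    n = len(preds)
    f_bar = sum(preds) / n                      # empirical stand-in for f_bar(x0)
    target = true_fn(x0)                        # f*(x0)

    total = sum((p - target) ** 2 for p in preds) / n
    variance = sum((p - f_bar) ** 2 for p in preds) / n
    bias_sq = (f_bar - target) ** 2
    cross = sum((p - f_bar) * (f_bar - target) for p in preds) / n
    return total, variance, bias_sq, cross


total, variance, bias_sq, cross = decompose(x0=0.8)
print(f"total={total:.4f} variance={variance:.4f} bias^2={bias_sq:.4f} cross={cross:.2e}")
```

Because the averages are taken over the same set of sampled training sets, the cross term is zero up to floating-point error and `total` matches `variance + bias_sq`, mirroring the algebra above. The constant learner was picked so that the bias term is clearly visible.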

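For a dataset of fixed size, there is a trade-off between variance and bias:
- The more complex the model, the greater its fitting capacity, so the bias decreases but the variance increases.
- The simpler the model, the smaller the variance, but the larger the bias.
For example, when a regularization term is added to the model, a larger regularization weight yields a simpler learned model. The variance decreases, which helps avoid overfitting, but the regularization term also increases the bias.
We therefore need a good balance between bias and variance so that the overall error is minimized.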
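
The effect of the regularization weight can be illustrated with a small simulation. This is only a sketch under stated assumptions, none of which come from these notes: a one-dimensional ridge regression without intercept (slope $w = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)$), a hypothetical linear target `true_fn`, Gaussian label noise, and the weights `0.0`, `1.0`, `10.0` for the parameter called `lam` here.

```python
import random


def true_fn(x):
    # Hypothetical target f*(x), chosen only for illustration.
    return 2.0 * x


def fit_ridge_slope(xs, ys, lam):
    # Closed-form 1-D ridge regression without intercept.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def bias_variance(lam, x0=1.0, m=10, n_datasets=3000, noise=2.0, seed=0):
    # The same seed is reused for every lam, so all weights see the same datasets.
    rng = random.Random(seed)
    preds = []
    for _ in range(n_datasets):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        ys = [true_fn(x) + rng.gauss(0.0, noise) for x in xs]
        preds.append(fit_ridge_slope(xs, ys, lam) * x0)  # f_D(x0)

    n = len(preds)
    f_bar = sum(preds) / n
    variance = sum((p - f_bar) ** 2 for p in preds) / n
    bias_sq = (f_bar - true_fn(x0)) ** 2
    return bias_sq, variance


results = {lam: bias_variance(lam) for lam in (0.0, 1.0, 10.0)}
for lam, (bias_sq, variance) in results.items():
    print(f"lam={lam:5.1f}  bias^2={bias_sq:.4f}  variance={variance:.4f}  "
          f"sum={bias_sq + variance:.4f}")
```

In this toy setup the squared bias grows and the variance shrinks as `lam` increases. Which weight gives the smallest sum depends on the noise level and the sample size, which is why the weight is usually tuned rather than fixed in advance.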

