# Nonparametric Bootstrap Method


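Suppose the population distribution $F$ is unknown, but a sample of size $n$ from $F$ is available. Drawing a sample of size $n$ from this observed sample with replacement gives a Bootstrap sample (also called a self-help sample). Drawing many Bootstrap samples independently and using them to make inferences about the population is called the nonparametric Bootstrap method. It is suited to situations where little is known about the population.
Advantages: it requires no assumption about the form of the population distribution, it can be applied to small samples, and it works for a wide variety of statistics.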

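## Bootstrap Estimate of the Standard Error of an Estimator
When estimating an unknown population parameter $\theta$, we should report not only the estimate $\hat\theta$ but also the standard error of that estimate, usually measured by $\sigma_{\hat\theta}=\sqrt{\operatorname{Var}(\hat\theta)}$.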

### Steps

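1. Successively and independently draw $B$ Bootstrap samples of size $n$ from the observed sample of size $n$, and compute the Bootstrap estimate of $\theta$ from each one:

$$x^{*i}=(x_1^{*i},x_2^{*i},\dots,x_n^{*i}),\qquad \hat\theta_i^*=\hat\theta(x_1^{*i},x_2^{*i},\dots,x_n^{*i}),\qquad i=1,2,\dots,B$$

2. Compute

$$\hat\sigma_{\hat\theta}=\sqrt{\frac{1}{B-1}\sum_{i=1}^{B}\left(\hat\theta_i^*-\bar\theta^*\right)^2}$$

where $\bar\theta^*=\frac{1}{B}\sum_{i=1}^{B}\hat\theta_i^*$.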
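
The two steps above can be sketched by hand in base MATLAB, without any toolbox function. This is only an illustration: the data vector `x` is hypothetical, the statistic is taken to be the sample median, and the variable names are assumptions rather than anything from the original notes.

```matlab
% Minimal sketch: Bootstrap standard error of the sample median.
% x is a hypothetical sample used only for illustration.
x = [2.1 3.4 1.9 5.0 4.2 3.3 2.8 4.7];
n = numel(x);
B = 1000;                          % number of Bootstrap samples
thetaStar = zeros(B, 1);           % Bootstrap estimates of theta
for i = 1:B
    idx = randi(n, 1, n);          % n indices drawn with replacement
    thetaStar(i) = median(x(idx)); % estimate from the i-th Bootstrap sample
end
thetaBar = mean(thetaStar);        % average of the Bootstrap estimates
seBoot = sqrt(sum((thetaStar - thetaBar).^2) / (B - 1));
disp(seBoot)
```

Because the draws are random, `seBoot` changes slightly from run to run; seeding the generator beforehand makes a run repeatable. The last formula is the same quantity that `std(thetaStar)` returns.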

## Bootstrap Estimate of the Mean Squared Error of an Estimator
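
The notes give no formula under this heading. Judging from the code in Example 1, which averages the squared deviations of the Bootstrap estimates from the estimate computed on the original sample, the intended estimator is presumably the following (the notation $\hat\theta_i^*$ and $B$ follows the standard-error section and is an assumption here):

$$\mathrm{MSE}(\hat\theta)=E\left[(\hat\theta-\theta)^2\right]\approx\frac{1}{B}\sum_{i=1}^{B}\left(\hat\theta_i^*-\hat\theta\right)^2$$

Here $\hat\theta$ is the estimate from the original sample and stands in for the unknown $\theta$.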

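### Example 1

The numbers of piglets surviving at birth in each of 30 litters are

9 8 10 12 11 12 7 9 11 8 9 7 7 8 9 7 9 9 10 9 9 9 12 10 10 9 13 11 13 9

(1) Taking the sample median $\hat\theta$ as the estimate of the population median $\theta$, find the Bootstrap estimate of its standard error.
(2) Find the Bootstrap estimate of the mean squared error $E[(\hat\theta-\theta)^2]$.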

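```matlab
% bootstrp requires the Statistics and Machine Learning Toolbox
data=[9  8  10  12  11  12  7  9  11  8  9  7  7  8  9  7  9  9  10  9  9  9  12  10  10  9  13  11  13  9];
b=bootstrp(1000,@(x)quantile(x,0.5),data); % 1000 Bootstrap samples; @(x)quantile(x,0.5) computes the median; data is the original sample
b_std=std(b); % Bootstrap estimate of the standard error of the median
b_mse=mean((b-quantile(data,0.5)).^2); % Bootstrap estimate of the mean squared error
b1=bootstrp(1000,@mean,data); % Bootstrap estimates of the mean
b1_std=std(b1); % Bootstrap estimate of the standard error of the mean
```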

## Bootstrap Confidence Intervals

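Let $X=(X_1,X_2,\dots,X_n)$ be a sample of size $n$ from the population $F$, and let $x=(x_1,x_2,\dots,x_n)$ be an observed sample value. $F$ contains an unknown parameter $\theta$, and $\hat\theta=\hat\theta(X_1,X_2,\dots,X_n)$ is an estimator of $\theta$. We want a confidence interval for $\theta$ at confidence level $1-\alpha$.

### Steps and Rationale

Independently draw $B$ Bootstrap samples of size $n$ from the sample $x$, and compute the Bootstrap estimate of $\theta$ from each one: $\hat\theta_1^*,\hat\theta_2^*,\dots,\hat\theta_B^*$.

Sort the estimates in ascending order: $\hat\theta_{(1)}^*\le\hat\theta_{(2)}^*\le\dots\le\hat\theta_{(B)}^*$.

Using the distribution of $\hat\theta^*$ as an approximation to that of $\hat\theta$, find approximate quantiles $\hat\theta_{\alpha/2}^*$ and $\hat\theta_{1-\alpha/2}^*$ such that

$$P\left\{\hat\theta_{\alpha/2}^*<\hat\theta^*<\hat\theta_{1-\alpha/2}^*\right\}=1-\alpha$$

so that, approximately,

$$P\left\{\hat\theta_{\alpha/2}^*<\theta<\hat\theta_{1-\alpha/2}^*\right\}=1-\alpha$$

Let $k_1=[B\cdot\alpha/2]$ and $k_2=[B\cdot(1-\alpha/2)]$, where $[\cdot]$ denotes the integer part. Taking $\hat\theta_{(k_1)}^*$ and $\hat\theta_{(k_2)}^*$ as estimates of the quantiles $\hat\theta_{\alpha/2}^*$ and $\hat\theta_{1-\alpha/2}^*$ gives the approximate equality

$$P\left\{\hat\theta_{(k_1)}^*<\theta<\hat\theta_{(k_2)}^*\right\}=1-\alpha$$

The Bootstrap confidence interval for $\theta$ at confidence level $1-\alpha$ is therefore $\left(\hat\theta_{(k_1)}^*,\hat\theta_{(k_2)}^*\right)$. This approach is called the percentile method.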
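
The percentile method described above can be sketched in base MATLAB as follows. The sample `x`, the choice of the mean as the statistic, and all variable names are assumptions made for illustration. The guard `max(1, ...)` is an added safeguard for very small `B` and is not part of the original description.

```matlab
% Minimal sketch: percentile-method Bootstrap confidence interval for the mean.
% x is a hypothetical sample used only for illustration.
x = [2.1 3.4 1.9 5.0 4.2 3.3 2.8 4.7];
n = numel(x);
B = 2000;                              % number of Bootstrap samples
alpha = 0.10;                          % confidence level is 1 - alpha
thetaStar = zeros(B, 1);
for i = 1:B
    thetaStar(i) = mean(x(randi(n, 1, n)));  % estimate from one resample
end
sortedTheta = sort(thetaStar);         % ascending order statistics
k1 = max(1, floor(B * alpha / 2));     % index of the lower quantile estimate
k2 = floor(B * (1 - alpha / 2));       % index of the upper quantile estimate
ci = [sortedTheta(k1), sortedTheta(k2)];
disp(ci)
```

The interval endpoints are order statistics of the Bootstrap estimates, so no distributional assumption enters the calculation.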

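### Example 1, Continued

Take the sample mean $\bar x$ as the estimate of the population mean $\mu$ and the sample standard deviation $s$ as the estimate of the population standard deviation $\sigma$. Find Bootstrap confidence intervals for $\mu$ and $\sigma$ at confidence level 0.90.

```matlab
% bootci returns Bootstrap confidence intervals: row 1 holds the lower limits, row 2 the upper limits.
% Note: bootci uses the BCa method by default; add 'type','per' to request the percentile method.
b=bootci(1000,{@(x)[mean(x),std(x)],data},'alpha',0.1)
```

Result: the first column is the confidence interval for the mean and the second column is the confidence interval for the standard deviation, b = [9.0667 1.4368; 10.0667 2.0609]. That is, $(9.0667, 10.0667)$ for $\mu$ and $(1.4368, 2.0609)$ for $\sigma$.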

<a name="ZoTrc"></a>
# Parametric Bootstrap Method
- Suppose the form of the population distribution function $F(x;\beta)$ is known, but it contains an unknown parameter $\beta$, and a sample $X_1,X_2,\dots,X_n$ from the population is available.
<a name="e4NVv"></a>
## Steps
1. From the sample, compute the maximum likelihood estimate $\hat\beta$ of $\beta$.
2. From the distribution $F(x;\hat\beta)$, generate $B$ ($B\ge1000$) samples of size $n$:
   $X_1^*,X_2^*,\dots,X_n^*\sim F(x;\hat\beta)$
3. Apply the nonparametric Bootstrap confidence interval procedure to these samples to obtain a Bootstrap confidence interval for $\beta$.
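
As a small illustration of the three steps, the sketch below assumes an exponential population, a model chosen here only because its maximum likelihood estimate (the sample mean) and its random number generation (inverse transform) need nothing beyond base MATLAB. The data and variable names are hypothetical and do not come from the original notes.

```matlab
% Minimal sketch: parametric Bootstrap for the mean of an exponential model.
% x is a hypothetical sample assumed to be exponentially distributed.
x = [1.2 0.4 2.7 0.9 3.1 0.6 1.8 2.2];
n = numel(x);
muHat = mean(x);                      % step 1: MLE of the exponential mean
B = 2000;                             % number of Bootstrap samples
alpha = 0.10;                         % confidence level is 1 - alpha
xStar = -muHat * log(rand(B, n));     % step 2: B samples of size n from the fitted model
muStar = mean(xStar, 2);              % MLE recomputed on each row (one row per sample)
sortedMu = sort(muStar);              % step 3: percentile method
k1 = max(1, floor(B * alpha / 2));
k2 = floor(B * (1 - alpha / 2));
ci = [sortedMu(k1), sortedMu(k2)];
disp(ci)
```

The only difference from the nonparametric version is step 2: the resamples come from the fitted distribution rather than from the observed data.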
<a name="uvau1"></a>
## Example 2
- The lifetime of a certain electronic component follows a Weibull distribution with distribution function

$$F(x)=\begin{cases}1-e^{-(x/\eta)^{\beta}}, & x>0\\ 0, & \text{otherwise}\end{cases}\qquad \beta>0,\ \eta>0$$

The parameter $\beta=2$ is known. The observed sample is 142.84 97.04 32.46 69.14 85.67 114.43 41.76 163.07 108.22 63.28.

(1) Find the maximum likelihood estimate of the parameter $\eta$.

(2) For time $t_0=50$, find the one-sided Bootstrap confidence interval for the reliability $R(50)=1-F(50)=e^{-(50/\eta)^2}$ at confidence level 0.95.

Solution:

(1) Deriving a maximum likelihood estimate is basic probability theory and is not detailed here. The result is $\hat\eta=\sqrt{\frac{\sum_{i=1}^{n}x_i^2}{n}}=100.0696$.

(2) With parameters $\beta=2$ and $\eta=\hat\eta=100.0696$, generate 5000 Bootstrap samples of size 10 from the corresponding Weibull distribution. For each sample, compute the Bootstrap estimate of $\eta$, $\hat\eta_i^*=\sqrt{\frac{\sum_{j=1}^{10}(x_j^{*i})^2}{10}}$. Sort the 5000 values $\hat\eta_i^*$ in ascending order and take the 250th from the left ($[5000\times0.05]=250$), which gives $\hat\eta_{(250)}^*=73.3758$.

Because $R(50)=e^{-(50/\eta)^2}$ is increasing in $\eta$, a lower limit for $\eta$ carries over to a lower limit for $R(50)$. The one-sided Bootstrap lower confidence limit at confidence level 0.95 is therefore $e^{-(50/\hat\eta_{(250)}^*)^2}=0.6286$.
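
The maximum likelihood step in part (1), which the solution skips, can be sketched as follows. With $\beta=2$ the density is obtained by differentiating $F$, and the rest is the usual log-likelihood argument:

$$f(x)=\frac{2x}{\eta^2}e^{-(x/\eta)^2},\quad x>0$$

$$\ln L(\eta)=n\ln 2+\sum_{i=1}^{n}\ln x_i-2n\ln\eta-\frac{1}{\eta^2}\sum_{i=1}^{n}x_i^2$$

$$\frac{d\ln L}{d\eta}=-\frac{2n}{\eta}+\frac{2}{\eta^3}\sum_{i=1}^{n}x_i^2=0\quad\Longrightarrow\quad\hat\eta=\sqrt{\frac{1}{n}\sum_{i=1}^{n}x_i^2}$$

For the ten observations, $\sum x_i^2\approx100139.2$, so $\hat\eta\approx\sqrt{10013.92}\approx100.07$, in agreement with the value stated in the solution.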
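```matlab
clc,clear
a=[142.84  97.04  32.46  69.14  85.67  114.43  41.76  163.07  108.22  63.28];
eta=sqrt(mean(a.^2)); % maximum likelihood estimate of eta
beta=2;B=5000;alpha=0.05;
b=wblrnd(eta,beta,[B,10]); % B-by-10 matrix of Weibull random numbers (scale eta, shape beta)
etahat=sqrt(mean(b.^2,2)); % maximum likelihood estimate of eta for each Bootstrap sample
seteta=sort(etahat); % sort etahat in ascending order
k=floor(B*alpha); % integer part of B*alpha
se=seteta(k); % estimate at the corresponding position
Rt0=exp(-(50/se)^2); % one-sided lower confidence limit for R(50)
```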
