• Hypothesis testing: first assume a result (a presumed value for the population), then test it against sample data
    • This is somewhat like proof by contradiction.
  • Steps (procedure):
    • Step-1: State null and alternative hypotheses
    • Step-2: Identify the test statistic
    • Step-3: Select a level of significance
    • Step-4: Formulate a decision rule
    • Step-5: Take a sample, arrive at decision
      • Do not reject
      • Reject
  • Steps (version 2):

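    • 1) State the hypotheses: $H_0$ and $H_a$
    • 2) Sketch the distribution and decide whether the test concerns the mean ($\mu$) or the variance ($\sigma^2$); the two use different distributions
    • 3) Locate the rejection region: one-tailed or two-tailed
    • 4) Fix the area of the rejection region, i.e. the significance level $\alpha$; in practice you choose it yourself, in exams it is usually given
    • 5) Use the chosen $\alpha$ to find the critical value (e.g. $z_{\alpha/2}$ for a two-tailed z-test)
    • 6) Compute the test statistic
    • 7) Compare the test statistic with the critical value to see whether it falls in the rejection region
    • 8) Make the decision:
      • Do not reject (fail to reject) $H_0$
      • Reject $H_0$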
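The eight steps can be sketched in a few lines of Python. This is a minimal illustration, assuming a two-tailed z-test on the mean with a known population standard deviation; the function name `z_test_two_tailed` and the sample numbers are made up for the example and do not come from the notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 against Ha: mu != mu0 (sigma assumed known)."""
    # Steps 3-5: the rejection region has total area alpha, alpha/2 in each tail,
    # which fixes the critical value.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 6: standardize the sample mean using the value stated in H0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Steps 7-8: reject H0 only if the statistic falls in the rejection region.
    reject = abs(z) > critical
    return z, critical, reject


# Hypothetical numbers, chosen only for illustration.
z, critical, reject = z_test_two_tailed(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(round(z, 2), round(critical, 2), "reject H0" if reject else "fail to reject H0")
# prints: 2.0 1.96 reject H0
```

In an exam setting the critical value would be read from a table rather than computed; `NormalDist().inv_cdf` simply plays the role of the z-table here.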

1. Define Hypothesis

  • A hypothesis is a statement about one or more population parameters.

    • It concerns the population, not the sample
  • Null hypothesis: the statement we doubt and want to reject
    • $H_0$
  • Alternative hypothesis: the statement we want to assess
    • $H_a$
  • The null and alternative hypotheses are complementary.
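  • One-tailed / two-tailed tests of hypothesis

    • Two-tailed:
      • $H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$
      • The rejection region lies in both tails
    • One-tailed:
      • $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
      • or:
      • $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
      • The rejection region lies in one tail: if the alternative hypothesis uses "greater than" ($H_a: \mu > \mu_0$), the rejection region is on the right; if it uses "less than" ($H_a: \mu < \mu_0$), the rejection region is on the left.
        • Explanation: what we hope to accept is placed in the alternative hypothesis and what we hope to reject is placed in the null hypothesis. The region in which the alternative holds is therefore exactly the rejection region of the null.
    • Every equality sign goes into the null hypothesis.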
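How the sign in the alternative hypothesis places the rejection region can be sketched as follows. This is an illustrative helper for a z-test only; `rejection_region` is a hypothetical name, and the standard normal quantiles stand in for a z-table lookup.

```python
from statistics import NormalDist


def rejection_region(alternative, alpha=0.05):
    """Return (lower, upper) critical z values for the sign used in Ha.

    None means there is no rejection region on that side.
    """
    std_normal = NormalDist()
    if alternative == "!=":
        z = std_normal.inv_cdf(1 - alpha / 2)          # alpha/2 in each tail
        return (-z, z)
    if alternative == ">":
        return (None, std_normal.inv_cdf(1 - alpha))   # right tail only
    if alternative == "<":
        return (std_normal.inv_cdf(alpha), None)       # left tail only
    raise ValueError("alternative must be '!=', '>' or '<'")


for sign in ("!=", ">", "<"):
    print(sign, rejection_region(sign))
```

H0 is rejected when the test statistic lies below the lower bound or above the upper bound, whichever exists for the chosen alternative.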

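2. Test statistic (z-statistic)

  • Formulas

    • $\text{test statistic} = \dfrac{\text{sample statistic} - \text{hypothesized value}}{\text{standard error of the sample statistic}}$
    • $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
    • $z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
      • Where the formula comes from:
        • We use $\bar{X}$ to make an inference about the population mean $\mu$; deciding whether it falls inside or outside the interval requires a table lookup.
        • By the central limit theorem, $\bar{X} \sim N(\mu, \sigma^2 / n)$ (approximately, for large $n$). Standardizing gives $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, which follows $N(0, 1)$. This standardized value is the test statistic, and it tells us whether the sample result falls inside or outside the interval.
          • Here $\mu$ is replaced by the value stated in the null hypothesis, $\mu_0$.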
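  • The test statistic follows a normal, t, chi-square, or F distribution
    • The test statistic follows one of these distributions because it is itself a random variable
    • $z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
      • $\bar{X}$ is a random variable, so the test statistic ($z$) is a random variable too
  • The test statistic has a formula (it is obtained by calculation)
    • Calculate it with the sample data.
    • Note that the test statistic is calculated by us, not looked up in a table.
  • This is the general formula, but it applies only to the z and t distributions.

    • The following two formulas can be used only for the z and t distributions:
      • $\text{test statistic} = \dfrac{\text{sample statistic} - \text{hypothesized value}}{\text{standard error}}$
      • $\text{standard error} = \dfrac{\sigma}{\sqrt{n}}$, or $\dfrac{s}{\sqrt{n}}$ when $\sigma$ is unknown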
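As a small worked example of "calculated by us, not from the table", the sketch below computes the statistic for a test on one mean. It assumes the standard forms $z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ and $t = (\bar{X} - \mu_0)/(s/\sqrt{n})$; the helper name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_test_statistic(sample, mu0, sigma=None):
    """(sample mean - hypothesized mean) / standard error.

    If sigma (population standard deviation) is given, the result is a z-statistic.
    Otherwise the sample standard deviation is used and the result is a
    t-statistic with n - 1 degrees of freedom.
    """
    n = len(sample)
    spread = sigma if sigma is not None else stdev(sample)
    standard_error = spread / sqrt(n)
    return (mean(sample) - mu0) / standard_error


sample = [1, 2, 3, 4, 5]                               # made-up observations
print(mean_test_statistic(sample, mu0=2, sigma=2.0))   # z-statistic, sigma known
print(mean_test_statistic(sample, mu0=2))              # t-statistic, 4 degrees of freedom
```

The value of $\mu_0$ comes from the null hypothesis; everything else comes from the sample (or from a known $\sigma$).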

3. Critical value (i.e. a quantile)

  • Once the test statistic has been computed, deciding whether it falls inside or outside the interval requires comparing it with the critical value (the $z$ value).

  • What determines the critical value ($z$ value)
    • The significance level ($\alpha$)
      • The significance level fixes the confidence level ($1 - \alpha$), which fixes the width of the interval and therefore $z$
    • As covered earlier, at the same significance level ($\alpha$) the t distribution gives a wider confidence interval; different distributions (different tables) therefore give different critical values.
    • One-tailed versus two-tailed:
      • Two-tailed at significance level $\alpha$ (each tail has area $\alpha/2$): the confidence level is $1 - \alpha$ and the critical value is $z_{\alpha/2}$ (for example, 1.96 when $\alpha = 5\%$)
      • One-tailed at significance level $\alpha$ (a single tail has area $\alpha$): the critical value equals the two-tailed critical value at significance level $2\alpha$, i.e. at confidence level $1 - 2\alpha$, and is $z_{\alpha}$ (for example, 1.645 when $\alpha = 5\%$)
      • For a given distribution and a given one-tailed or two-tailed setup, the critical value is determined solely by the significance level.
  • Found in the z, t, chi-square, or F distribution tables, not calculated by us.

    • The critical value ($z$ value) is obtained by table lookup.
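The one-tailed versus two-tailed relationship can be checked numerically. The sketch below covers the z case only and uses the standard normal quantile function in place of a printed z-table; `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed):
    """Positive critical z value for significance level alpha."""
    # Two-tailed: alpha is split across both tails. One-tailed: all of it sits in one tail.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


print(round(z_critical(0.05, two_tailed=True), 3))    # 1.96
print(round(z_critical(0.05, two_tailed=False), 3))   # 1.645
# A one-tailed test at 5% uses the same critical value as a two-tailed test at 10%.
print(abs(z_critical(0.05, two_tailed=False) - z_critical(0.10, two_tailed=True)) < 1e-12)
```

For t, chi-square, or F tests the same idea applies, but the lookup also depends on the degrees of freedom.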

4. Decision rule

  • Critical value method

    • What is the significance level?
    • Is the test two-tailed or one-tailed?
    • Where is the rejection region? Find the critical value under these conditions
    • Compare the test statistic with the critical value
  • Reject $H_0$ if $|\text{test statistic}| > \text{critical value}$ (for a one-tailed test, compare on the side of the rejection region)
  • Fail to reject $H_0$ if $|\text{test statistic}| \le \text{critical value}$


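  • Wording the conclusion:

    • We cannot say "accept the null hypothesis"; we can only say "cannot reject" it
    • When rejecting: the value "is significantly different from" the hypothesized value
    • When failing to reject: the value "is not significantly different from" the hypothesized value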
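The critical value method, including the "fail to reject" wording, can be sketched as a small function. It assumes the caller supplies the positive critical value read from the appropriate table; the function name and the numbers are illustrative.

```python
def decide(test_statistic, critical_value, alternative):
    """Critical value method; critical_value is the positive table value."""
    if alternative == "!=":
        reject = abs(test_statistic) > critical_value   # rejection region in both tails
    elif alternative == ">":
        reject = test_statistic > critical_value        # rejection region on the right
    elif alternative == "<":
        reject = test_statistic < -critical_value       # rejection region on the left
    else:
        raise ValueError("alternative must be '!=', '>' or '<'")
    # Never "accept H0": the weaker statement is "fail to reject".
    return "reject H0" if reject else "fail to reject H0"


print(decide(2.10, 1.96, "!="))    # reject H0
print(decide(-1.50, 1.645, "<"))   # fail to reject H0
```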

5. Relationship between confidence intervals and hypothesis tests

  • Confidence interval = sample statistic $\pm$ critical value × standard error

  • Center of interval = sample statistic
  • Length of interval = 2 × critical value × standard error
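The link between the two ideas is that a two-tailed test at level $\alpha$ rejects $H_0: \mu = \mu_0$ exactly when $\mu_0$ lies outside the $1 - \alpha$ confidence interval. A minimal sketch, assuming a z-interval with known $\sigma$ and made-up numbers:

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """sample statistic +/- critical value * standard error (z-interval, sigma known)."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = critical * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical numbers, chosen only for illustration.
lower, upper = confidence_interval(sample_mean=52.0, sigma=10.0, n=100)
mu0 = 50.0
z = (52.0 - mu0) / (10.0 / sqrt(100))
outside_interval = not (lower <= mu0 <= upper)
rejects = abs(z) > NormalDist().inv_cdf(0.975)
print(round(lower, 2), round(upper, 2), outside_interval, rejects)
# prints: 50.04 53.96 True True
```

The interval is centered on the sample mean and its length is twice the half-width, matching the two bullet points above.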

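6. Hypothesis tests on means and variances: summary

  • Tests on the mean ($\mu$) and tests on the variance ($\sigma^2$) use different distributions

  • For the summary below, it is enough to know which distribution goes with each test on a mean or a variance
  • Tests on the mean ($\mu$)
    • One population (e.g. testing $H_0: \mu = \mu_0$)
      • t distribution or z distribution (normal distribution)
        • Use z when the variance is known and t when it is unknown; a non-normal population with a small sample cannot be tested this way
        • When the sample is large enough, z can be used whether the variance is known or unknown
      • The corresponding test statistic is the one covered in the test statistic section above [must master]
    • Two populations (e.g. testing $H_0: \mu_1 = \mu_2$)
      • The two populations are independent: t distribution
      • The two populations are not independent: use the paired comparisons test (match the observations into pairs and test the pairwise differences)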
        • The corresponding test statistic is:
          • $t = \dfrac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$, with $n - 1$ degrees of freedom
            • Standard error of the mean difference: $\sigma_{\bar{d}} = \dfrac{\sigma_d}{\sqrt{n}}$ (since $\sigma_d$ is usually unknown, $s_d$ is used in its place, giving $s_{\bar{d}} = \dfrac{s_d}{\sqrt{n}}$)
          • Derivation (converting the problem into a test of $\mu_d = 0$):
            • Suppose we want to test $H_0: \mu_X = \mu_Y$, where X and Y are not independent (they are correlated). Match the observations into pairs and compute the difference for each pair, $d_i = X_i - Y_i$. Testing $\mu_X = \mu_Y$ then becomes testing $\mu_d = 0$
              • $X_1$ $Y_1$ $d_1 = X_1 - Y_1$
              • $X_2$ $Y_2$ $d_2 = X_2 - Y_2$
              • $\dots$
              • $X_n$ $Y_n$ $d_n = X_n - Y_n$
            • The corresponding test statistic is then $t = \dfrac{\bar{d} - 0}{s_d / \sqrt{n}}$
  • Tests on the variance ($\sigma^2$)
    • One population (e.g. $H_0: \sigma^2 = \sigma_0^2$): chi-square distribution
    • Two populations (e.g. $H_0: \sigma_1^2 = \sigma_2^2$): F distribution
    • A basic familiarity with the test statistics for these two distributions is enough.
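The test statistics in this summary can be computed directly from sample data. The sketch below uses the standard textbook forms, which are an assumption here because the notes do not spell out the chi-square and F formulas: $t = \bar{d} / (s_d / \sqrt{n})$ for paired differences, $\chi^2 = (n - 1) s^2 / \sigma_0^2$ for one variance, and $F = s_1^2 / s_2^2$ with the larger sample variance in the numerator. All function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t_statistic(x, y, mu_d0=0.0):
    """Paired comparisons test: t on the pairwise differences, n - 1 degrees of freedom."""
    differences = [a - b for a, b in zip(x, y)]
    n = len(differences)
    t = (mean(differences) - mu_d0) / (stdev(differences) / sqrt(n))
    return t, n - 1


def chi_square_statistic(sample, sigma0_squared):
    """One-variance test: (n - 1) * s^2 / sigma0^2, n - 1 degrees of freedom."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_squared, n - 1


def f_statistic(sample_1, sample_2):
    """Two-variance test: larger sample variance over the smaller one."""
    v1, v2 = variance(sample_1), variance(sample_2)
    if v1 >= v2:
        return v1 / v2, (len(sample_1) - 1, len(sample_2) - 1)
    return v2 / v1, (len(sample_2) - 1, len(sample_1) - 1)


# Made-up data, chosen only for illustration.
after, before = [11, 14, 10, 12, 13], [10, 12, 9, 11, 13]
print(paired_t_statistic(after, before))
print(chi_square_statistic([1, 2, 3, 4, 5], sigma0_squared=2.0))
print(f_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Each statistic is then compared with a critical value from the matching table (t, chi-square, or F) using the returned degrees of freedom.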


7. p-value Method

  • The p-value is the smallest level of significance at which the null hypothesis can be rejected.
    • In other words, the p-value is the smallest significance level ($\alpha$) at which $H_0$ is rejected. It is the area of the tails beyond the observed test statistic (both tails, or a single tail for a one-tailed test). Because it is a probability, it lies between 0 and 1.
    • The p-value can therefore be compared with the significance level ($\alpha$) to decide whether to reject the null hypothesis.
  • $p\text{-value} < \alpha$: reject $H_0$
  • $p\text{-value} \ge \alpha$: do not reject $H_0$
  • As the p-value decreases, it becomes easier to reject $H_0$ (the smaller the p-value, the stronger the case for rejection)
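For a z-test the p-value is just a tail area of the standard normal distribution, so it can be computed directly. A minimal sketch, with a hypothetical helper name and an illustrative statistic of 2.0:

```python
from statistics import NormalDist


def z_p_value(z, alternative):
    """Tail area beyond the observed z statistic for the given alternative."""
    cdf = NormalDist().cdf
    if alternative == "!=":
        return 2 * (1 - cdf(abs(z)))   # both tails
    if alternative == ">":
        return 1 - cdf(z)              # right tail
    if alternative == "<":
        return cdf(z)                  # left tail
    raise ValueError("alternative must be '!=', '>' or '<'")


p = z_p_value(2.0, "!=")
for alpha in (0.05, 0.01):
    print(alpha, round(p, 4), "reject H0" if p < alpha else "do not reject H0")
```

With this statistic the two-tailed p-value is about 0.0455, so the same data reject $H_0$ at the 5% level but not at the 1% level.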

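8. Type-1 error and Type-2 error [always tested]

  • Type-1 error: reject the null hypothesis when it's actually true.

    • Rejecting a true null ("convicting the innocent"): the null hypothesis is correct but gets rejected
    • P(Type-1 error) = significance level
    • Significance level ($\alpha$): the probability of making a Type-1 error.
  • Type-2 error: fail to reject the null hypothesis when it's actually false.
    • Keeping a false null ("letting the guilty go free"): the null hypothesis is wrong but is not rejected
    • Power of a test: the probability of correctly rejecting the null hypothesis when it is false.
      • That is, the probability that the null hypothesis is false and is indeed rejected. This measures the ability of the test, because the null hypothesis is what we set out to reject.
      • Power of a test + P(Type-2 error) = 1
        • "The null is false and is rejected" and "the null is false but is not rejected" are complementary events, so their probabilities must sum to 1.
      • Therefore: Power of a test = 1 - P(Type-2 error)
    • Does P(Type-1 error) + P(Type-2 error) equal 1?
      • No. P(Type-1 error) is not the same thing as the power of a test: the former is computed when the null is true, the latter when the null is false. It is P(Type-2 error) and the power of a test (also a probability) that sum to 1.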
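The statement P(Type-1 error) = significance level can be checked by simulation: draw many samples from a population for which $H_0$ is true and count how often the test rejects. This is an illustrative Monte Carlo sketch with made-up parameters, assuming a two-tailed z-test with known $\sigma$.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_1_error_rate(alpha=0.05, n=25, trials=5000, seed=0):
    """Share of samples drawn under a true H0 for which the z-test rejects H0."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0                      # H0: mu = 0 is true by construction
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
        if abs(z) > critical:
            rejections += 1                    # a Type-1 error: H0 is true but rejected
    return rejections / trials


print(type_1_error_rate())   # should be close to alpha = 0.05
```

The simulated rate fluctuates around $\alpha$ from run to run; it does not depend on the sample size, which is why increasing $n$ alone does not change P(Type-1 error).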


  • With other conditions unchanged, lowering the probability of one type of error raises the probability of the other.
  • How to reduce both errors?

    • Increase the sample size.
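The trade-off between the two errors, and the effect of sample size, can be made concrete by computing the power of a test in closed form. The sketch below assumes a right-tailed z-test of $H_0: \mu \le \mu_0$ with known $\sigma$ and a specific true mean $\mu_1 > \mu_0$; the function name and all numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed(mu0, mu1, sigma, n, alpha):
    """P(reject H0 | true mean is mu1) for the right-tailed z-test of H0: mu <= mu0."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)
    # Under the true mean mu1 the z statistic is centered at this shift instead of 0.
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    return 1 - std_normal.cdf(critical - shift)


for alpha, n in [(0.05, 100), (0.01, 100), (0.05, 400)]:
    power = power_right_tailed(mu0=50, mu1=52, sigma=10, n=n, alpha=alpha)
    beta = 1 - power                           # P(Type-2 error)
    print(f"alpha={alpha}, n={n}, power={power:.3f}, beta={beta:.3f}")
```

Lowering $\alpha$ from 5% to 1% at the same $n$ raises $\beta$, while raising $n$ at the same $\alpha$ lowers $\beta$. In none of the rows do $\alpha$ and $\beta$ sum to 1.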

9. Parametric tests and nonparametric tests

  • Parametric tests

    • Based on specific distributional assumptions for the population
    • Concern a parameter of the population
    • For example, the t-test.
  • Nonparametric tests

    • A nonparametric test either is not concerned with a parameter or makes minimal assumptions about the population from which the sample comes.
    • Nonparametric tests are used when:
      • The data do not meet distributional assumptions (e.g. a non-normal population with a small sample)
        • For example: a hypothesis test of the mean of a variable whose distribution is not normal and whose sample is small, so that neither the t-test nor the z-test is appropriate.
      • The data are given in ranks (ordinal data)
      • The hypothesis we are addressing does not concern a parameter
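To make "minimal assumptions about the population" concrete, here is a sketch of one common nonparametric test, the sign test for a median. The notes do not name a specific nonparametric test, so this choice, the helper name, and the data are all assumptions made for illustration. The test only counts how many observations fall above and below the hypothesized median and uses the binomial distribution, with no normality assumption.

```python
from math import comb


def sign_test_p_value(sample, median0):
    """Two-sided sign test of H0: population median = median0."""
    above = sum(1 for x in sample if x > median0)
    below = sum(1 for x in sample if x < median0)
    n = above + below                  # observations equal to median0 are dropped
    k = min(above, below)
    # Under H0 each observation is above or below the median with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test_p_value([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], median0=2.5))
# prints: 0.109375
```

Here 8 of 10 observations lie above 2.5, which is not extreme enough to reject $H_0$ at the 5% level.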

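10. Examples

(1) Define Hypothesis

(a) Null hypothesis and alternative hypothesis

[Image: example problem]

(b) Alternative hypothesis and null hypothesis

[Image: example problem]

(2) Hypothesis testing: procedure and the reject / fail-to-reject decision

(a) Applying the testing procedure (two-tailed)

[Images: example problem]

(b) Applying the testing procedure (one-tailed)

[Image: example problem]

(c) Deciding whether to reject the hypothesis

[Image: example problem]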

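(3) Hypothesis testing: t-test (paired comparisons test)

  • Includes using a financial calculator to compute the mean and variance (standard deviation) of a set of numbers

[Image: example problem]

(4) Hypothesis testing: chi-square test

[Images: example problem]

(5) Hypothesis testing: F-test for equal variances

[Images: example problem]

  • Decision rule for F-test

[Image: decision rule for the F-test]

(6) Comparing different hypothesis tests

  • Testing whether two variances differ uses the F distribution
  • Characteristics of the paired comparisons test

[Image: example problem]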

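(7) p-value Method

[Image: example problem]

  • In option C, "largest probability" should read "smallest probability".

(8) Type-1 error and Type-2 error

(a) Type-2 error

[Image: example problem]

(b) Type-1 error and Type-2 error

[Image: example problem]

(c) The trade-off between Type-1 error and Type-2 error

[Image: example problem]

(d) Understanding the power of a test

[Image: example problem]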