lec-09.pdf

Definition

For a probability density function $p_\theta(x)$, the log-likelihood function is $\ell_\theta(x) = \log p_\theta(x)$.

Define the Fisher information:

$$
I_\theta = \mathbb{E}_\theta\left[\dot{\ell}_\theta \dot{\ell}_\theta^\top\right]
$$

where $\dot{\ell}_\theta = \nabla_\theta \log p_\theta(X)$ is the gradient of the log-likelihood (the score).

:::info The log-likelihood $\ell_\theta(x)$ itself measures the amount of information, while the Fisher matrix describes how that information varies. :::

Some properties

:::info $\dot{\ell}_\theta$ has expectation 0 :::

$$
\begin{aligned}
\mathbb{E}_\theta\left[\dot{\ell}_\theta\right] &= \int p_\theta(x) \nabla_\theta \log p_\theta(x) \, dx = \int \frac{\nabla p_\theta(x)}{p_\theta(x)} p_\theta(x) \, dx \\
&= \int \nabla p_\theta(x) \, dx \stackrel{(\star)}{=} \nabla \int p_\theta(x) \, dx = \nabla 1 = 0
\end{aligned}
$$

where step $(\star)$ assumes that integration and differentiation can be interchanged.

:::info The Fisher information matrix can also be defined through the Hessian of $\log p_\theta(x)$ :::

$$
\nabla^2 \log p_\theta(x) = \frac{\nabla^2 p_\theta(x)}{p_\theta(x)} - \frac{\nabla p_\theta(x) \nabla p_\theta(x)^\top}{p_\theta(x)^2} = \frac{\nabla^2 p_\theta(x)}{p_\theta(x)} - \dot{\ell}_\theta \dot{\ell}_\theta^\top
$$

Taking expectations then gives an equivalent expression for the Fisher information matrix:

$$
\begin{aligned}
I_\theta &= \mathbb{E}_\theta\left[\dot{\ell}_\theta \dot{\ell}_\theta^\top\right] = -\int p_\theta(x) \nabla^2 \log p_\theta(x) \, dx + \int \nabla^2 p_\theta(x) \, dx \\
&= -\mathbb{E}\left[\nabla^2 \log p_\theta(x)\right] + \nabla^2 \underbrace{\int p_\theta(x) \, dx}_{=1} = -\mathbb{E}\left[\nabla^2 \log p_\theta(x)\right]
\end{aligned}
$$

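:::info $I_\theta$ is also the variance (covariance matrix) of $\dot{\ell}_\theta$: the larger the variance, the more information has been collected. :::

Because $\mathbb{E}_\theta[\dot{\ell}_\theta] = 0$,

$$
\operatorname{Cov}_\theta\left(\dot{\ell}_\theta\right) = \mathbb{E}_\theta\left[\dot{\ell}_\theta \dot{\ell}_\theta^\top\right] - \mathbb{E}_\theta\left[\dot{\ell}_\theta\right] \mathbb{E}_\theta\left[\dot{\ell}_\theta\right]^\top = I_\theta
$$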
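
As a quick numerical sanity check of the properties above, here is a minimal sketch using a one-dimensional Gaussian $N(\theta, \sigma^2)$ with known $\sigma$. This model is chosen here purely for illustration and is not taken from the lecture notes; the function names, sample size, and seed are arbitrary. For this model the score is $(x-\theta)/\sigma^2$ and the second derivative of the log-density is $-1/\sigma^2$, so the three printed values should be close to $0$, $1/\sigma^2$, and $1/\sigma^2$.

```python
import numpy as np

def score(x, theta, sigma):
    """Derivative of log N(x | theta, sigma^2) with respect to theta."""
    return (x - theta) / sigma**2

def hessian(x, theta, sigma):
    """Second derivative of log N(x | theta, sigma^2) with respect to theta."""
    return np.full_like(x, -1.0 / sigma**2)

# Illustrative parameter values (assumptions, not from the notes).
theta, sigma = 1.5, 2.0
rng = np.random.default_rng(0)
x = rng.normal(theta, sigma, size=200_000)

s = score(x, theta, sigma)
score_mean = s.mean()                            # estimates E[score], expected near 0
fisher_outer = np.mean(s**2)                     # estimates E[score^2]
fisher_hess = -hessian(x, theta, sigma).mean()   # estimates -E[Hessian]

print(score_mean, fisher_outer, fisher_hess)
```

The two estimates of $I_\theta$ agree up to Monte Carlo error, and because the score has mean approximately zero, its second moment is also its variance.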

Summary

$$
I_\theta = \mathbb{E}_\theta\left[\dot{\ell}_\theta \dot{\ell}_\theta^\top\right] = -\mathbb{E}_\theta\left[\nabla^2 \log p_\theta(X)\right]
$$

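:::info As the number $n$ of i.i.d. samples $X_1, \dots, X_n$ increases, the Fisher information of the whole sample grows linearly in $n$ (it equals $n I_\theta$). :::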
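
The linear growth can be seen in a short derivation, assuming the samples $X_1, \dots, X_n$ are drawn i.i.d. from $p_\theta$. The notation $I_\theta^{(n)}$ for the information carried by the whole sample is introduced here only for illustration. Under independence the joint log-likelihood is a sum, so its Hessian is a sum as well:

$$
\begin{aligned}
I_\theta^{(n)} &= -\mathbb{E}_\theta\left[\nabla^2 \log \prod_{i=1}^{n} p_\theta(X_i)\right] = -\sum_{i=1}^{n} \mathbb{E}_\theta\left[\nabla^2 \log p_\theta(X_i)\right] = n \, I_\theta
\end{aligned}
$$

Without independence or identical distribution the sum does not collapse to $n I_\theta$ in general.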

Example

Example 8.1 (Canonical exponential family): In a canonical exponential family model, we have $\log p_\theta(x) = \langle \theta, \phi(x) \rangle - A(\theta)$, where $\phi$ is the sufficient statistic and $A$ is the log-partition function. Because $\dot{\ell}_\theta = \phi(x) - \nabla A(\theta)$ and $\nabla^2 \log p_\theta(x) = -\nabla^2 A(\theta)$ is a constant, we obtain

$$
I_\theta = \nabla^2 A(\theta)
$$
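
To make the example concrete, the sketch below checks $I_\theta = \nabla^2 A(\theta)$ for the Bernoulli distribution in canonical form, where $\phi(x) = x$, $x \in \{0, 1\}$, and $A(\theta) = \log(1 + e^\theta)$. The Bernoulli choice, the helper names, and the value of $\theta$ are assumptions made here for illustration. The score-based value is computed exactly by summing over the two outcomes, and $A''(\theta)$ is approximated with a central finite difference.

```python
import math

def log_partition(theta):
    """A(theta) = log(1 + exp(theta)) for the canonical Bernoulli family."""
    return math.log1p(math.exp(theta))

def sigmoid(theta):
    """Mean parameter: grad A(theta) = E[phi(X)] = P(X = 1)."""
    return 1.0 / (1.0 + math.exp(-theta))

def fisher_from_score(theta):
    """E[(phi(X) - grad A(theta))^2], summed exactly over x in {0, 1}."""
    mu = sigmoid(theta)
    total = 0.0
    for x in (0, 1):
        p = math.exp(theta * x - log_partition(theta))  # p_theta(x)
        total += p * (x - mu) ** 2
    return total

def fisher_from_hessian(theta, h=1e-4):
    """Central finite-difference approximation of A''(theta)."""
    return (log_partition(theta + h)
            - 2.0 * log_partition(theta)
            + log_partition(theta - h)) / h**2

theta = 0.7  # arbitrary illustrative value
print(fisher_from_score(theta), fisher_from_hessian(theta))
```

Both values match the closed form $\sigma(\theta)(1 - \sigma(\theta))$, where $\sigma$ is the logistic function, up to the finite-difference error.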

Further reading

  1. Fisher Information - Stanford University https://web.stanford.edu › stats311 › Lectures › lec-09
  2. What is the intuitive meaning of Fisher information? - Zhihu https://www.zhihu.com/question/26561604
  3. Likelihood estimation, the Hessian matrix, and Fisher information through the eyes of a deep-model practitioner

To close, here is a quote from an answer on Zhihu:

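:::info Here is one way to look at it. In information geometry, a family of probability density functions can be viewed as a Riemannian manifold homeomorphic to the parameter space, and the Fisher information matrix can be viewed as the Riemannian metric on this statistical manifold. One can show that this metric is the one induced on the manifold by the ambient Euclidean space. Further computation shows that the manifold corresponding to the family of one-dimensional normal distributions has constant curvature -1/2, so it is a hyperbolic manifold. This idea seems to have been proposed first by Rao, the Rao of Cramér-Rao. :::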

Author: 刘大 Link: https://www.zhihu.com/question/26561604/answer/93809082