PART 1 Probabilistic Reasoning - 2 Representation - Notes on Algorithms for Decision Making

1. Belief
2. Probability distributions
3. Joint distributions
4. Conditional distributions

1. Belief

Proposition $A$ is more plausible than proposition $B$: $A \succ B$
$P(A) > P(B)$ if and only if $A \succ B$
$P(A) = P(B)$ if and only if $A \sim B$, where $A \sim B$ means that $A$ and $B$ are equally plausible

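2. Probability distributions

A probability distribution assigns probabilities to different outcomes. Distributions are either discrete or continuous.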

Discrete probability distributions

A discrete probability distribution is a distribution over a discrete set of values, represented by a probability mass function.
A variable $X$ takes one of $n$ distinct values $1, \ldots, n$, written $1:n$ in colon notation.
As shorthand, $P(x^3)$ stands for $P(X = 3)$; for a binary variable, $P(x^0)$ stands for $P(X = 0)$.
A probability mass function must satisfy $\sum_{i=1}^{n} P(X = i) = 1$ and $0 \le P(X = i) \le 1$ for all $i$.
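
As a minimal sketch, the two constraints on a probability mass function (every mass lies in $[0, 1]$ and the masses sum to 1) can be checked in a few lines of Python. The helper name `is_valid_pmf`, the tolerance, and the example masses are assumptions made for illustration; they do not come from the notes.

```python
import math


def is_valid_pmf(masses, tol=1e-9):
    """Return True if `masses` = [P(X=1), ..., P(X=n)] is a valid PMF."""
    # Constraint 1: every mass lies in [0, 1].
    in_range = all(0.0 <= p <= 1.0 for p in masses)
    # Constraint 2: the masses sum to 1 (up to floating-point tolerance).
    sums_to_one = math.isclose(sum(masses), 1.0, abs_tol=tol)
    return in_range and sums_to_one


valid = is_valid_pmf([0.1, 0.2, 0.3, 0.4])   # illustrative masses for n = 4
invalid = is_valid_pmf([0.5, 0.6])           # masses sum to 1.1
print(valid, invalid)
```

The tolerance matters because sums of floating-point masses rarely equal 1 exactly.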

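Continuous probability distributions

A continuous probability distribution is a distribution over a continuous set of values, represented by a probability density function.
A probability density function $p(x)$ must satisfy $\int_{-\infty}^{\infty} p(x)\,dx = 1$.
A continuous distribution can also be represented by its cumulative distribution function.
Definition of the cumulative distribution function: $\text{cdf}_X(x) = P(X \le x) = \int_{-\infty}^{x} p(x')\,dx'$
Quantile function, also called the inverse cumulative distribution function: $\text{quantile}_X(\alpha)$ is the value $x$ such that $P(X \le x) = \alpha$. In other words, the quantile function returns the smallest $x$ whose cumulative distribution value is greater than or equal to $\alpha$.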
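
The relationship between the density, the cumulative distribution function, and the quantile function can be illustrated with a standard normal distribution. This is a sketch using `statistics.NormalDist` from the Python standard library; the choice of distribution, the integration range, and the step size are assumptions made for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # assumed example distribution

# The density should integrate to 1. A simple Riemann sum over [-8, 8]
# is used here, since the tails beyond that range are negligible.
step = 0.001
area = sum(dist.pdf(-8.0 + i * step) * step for i in range(16000))

# cdf(x) = P(X <= x); the quantile function (inv_cdf) inverts it.
x = 1.0
alpha = dist.cdf(x)
x_recovered = dist.inv_cdf(alpha)

print(area, alpha, x_recovered)
```

Applying the quantile function to `alpha` recovers `x` up to floating-point error, which is the sense in which it is the inverse of the cumulative distribution function.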

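3. Joint distributions

A marginal distribution is computed from a joint distribution with the law of total probability: $P(x) = \sum_{y} P(x, y)$ or, for continuous variables, $p(x) = \int p(x, y)\,dy$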
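A decision tree can represent a joint probability distribution more compactly than a table, particularly when many assignments share the same probability.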
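
Marginalizing a joint distribution stored as a table amounts to summing out the unwanted variables. The sketch below assumes a joint distribution over two binary variables stored as a dictionary; the function name `marginal` and all numbers are illustrative assumptions.

```python
from collections import defaultdict

# joint[(x, y)] = P(X = x, Y = y); made-up numbers that sum to 1.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}


def marginal(joint, axis):
    """Sum out every variable except the one at position `axis`."""
    out = defaultdict(float)
    for assignment, prob in joint.items():
        out[assignment[axis]] += prob
    return dict(out)


p_x = marginal(joint, 0)  # P(X)
p_y = marginal(joint, 1)  # P(Y)
print(p_x, p_y)
```

A table over $n$ binary variables has $2^n$ entries, which is why more compact representations become attractive as $n$ grows.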

4. Conditional distributions

Conditional probability: $P(x \mid y) = \dfrac{P(x, y)}{P(y)}$
Bayes' rule: $P(x \mid y) = \dfrac{P(y \mid x)\,P(x)}{P(y)}$
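
The definition of conditional probability and Bayes' rule can be checked against each other numerically. The sketch below assumes a small joint table over two binary variables; the helper names and all numbers are made up for illustration.

```python
# joint[(x, y)] = P(X = x, Y = y); made-up numbers that sum to 1.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}


def p_x(x):
    """Marginal P(X = x)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def p_y(y):
    """Marginal P(Y = y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def p_x_given_y(x, y):
    """Conditional probability P(x | y) = P(x, y) / P(y)."""
    return joint[(x, y)] / p_y(y)


def p_y_given_x(y, x):
    """Conditional probability P(y | x) = P(x, y) / P(x)."""
    return joint[(x, y)] / p_x(x)


# P(X = 1 | Y = 1), computed directly and through Bayes' rule.
direct = p_x_given_y(1, 1)
via_bayes = p_y_given_x(1, 1) * p_x(1) / p_y(1)
print(direct, via_bayes)
```

Both routes give the same value up to floating-point error, since Bayes' rule follows directly from the definition of conditional probability.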
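Conditional Gaussian model: given a continuous variable $X$ and a discrete variable $Y$ with values $1:n$, the conditional density is defined as $p(x \mid y) = \mathcal{N}(x \mid \mu_i, \sigma_i^2)$ when $y = i$, with parameter vector $\theta = [\mu_{1:n}, \sigma_{1:n}]$.
Linear Gaussian model: a linear Gaussian model of $P(X \mid Y)$ represents the distribution over a continuous variable $X$ as a Gaussian whose mean is a linear function of the value of a continuous variable $Y$. The conditional density is $p(x \mid y) = \mathcal{N}(x \mid m y + b, \sigma^2)$, with parameters $\theta = [m, b, \sigma]$.
Conditional linear Gaussian model: combines the ideas of the conditional Gaussian and linear Gaussian models, so that a continuous variable can be conditioned on both discrete and continuous variables. If $X$ and $Y$ are continuous variables and $Z$ is a discrete variable with values $1:n$, the conditional density is $p(x \mid y, z) = \mathcal{N}(x \mid m_i y + b_i, \sigma_i^2)$ when $z = i$, with parameter vector $\theta = [m_{1:n}, b_{1:n}, \sigma_{1:n}]$.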
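
A conditional linear Gaussian density can be sketched directly from its definition: the discrete variable selects a set of parameters, and the continuous parent shifts the mean linearly. The function name, the parameter values, and the use of 0-based indices for the discrete variable (instead of $1:n$) are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def conditional_linear_gaussian_pdf(x, y, z, m, b, sigma):
    """Density p(x | y, z) = N(x | m[z] * y + b[z], sigma[z]^2).

    `z` is a 0-based index into the per-value parameter lists.
    """
    mean = m[z] * y + b[z]
    return NormalDist(mu=mean, sigma=sigma[z]).pdf(x)


# Made-up parameters for a discrete variable with two values.
m = [1.0, -2.0]
b = [0.0, 1.0]
sigma = [1.0, 0.5]

# With z = 1 and y = -1, the mean is (-2)(-1) + 1 = 3, so x = 3 sits at the peak.
density_at_mean = conditional_linear_gaussian_pdf(3.0, -1.0, 1, m, b, sigma)
peak = 1.0 / (sigma[1] * math.sqrt(2.0 * math.pi))
print(density_at_mean, peak)
```

Setting every slope in `m` to zero reduces this to a conditional Gaussian model, and using a single discrete value reduces it to a linear Gaussian model.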
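Bayesian network: a representation of a joint probability distribution. Its structure is a directed acyclic graph of nodes and directed edges in which each node corresponds to a variable. Each directed edge connects a pair of nodes, and the graph contains no cycles. Directed edges indicate direct probabilistic relationships. Each node $X_i$ has an associated conditional distribution $P(X_i \mid \text{Pa}(X_i))$, where $\text{Pa}(X_i)$ denotes the parents of $X_i$. This factorization reduces the number of independent parameters needed to specify the joint distribution.
Chain rule for Bayesian networks: given variables $X_{1:n}$, the probability of a particular assignment $x_{1:n}$ of values to all of these variables is $P(x_{1:n}) = \prod_{i=1}^{n} P(x_i \mid \text{pa}(x_i))$, where $\text{pa}(x_i)$ is the particular assignment of values to the parents of $X_i$.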

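For example, the figure (not reproduced here) shows a Bayesian network structure over 5 binary variables. Every variable has domain $\{0, 1\}$, and two of the nodes have no parents. By the chain rule, the probability that all variables take a given value is the product of 5 factors, one per node: an unconditional probability for each parentless node and a conditional probability given its parents for every other node.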
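
The chain rule can be sketched in code for a network of 5 binary variables. Because the figure is not available, the structure below is a hypothetical one: `B` and `S` have no parents, `E` has parents `B` and `S`, and `D` and `C` each have parent `E`. The node names and every probability are assumptions made for illustration.

```python
import math
from itertools import product

# Hypothetical structure: parents[v] lists the parents of node v.
parents = {"B": (), "S": (), "E": ("B", "S"), "D": ("E",), "C": ("E",)}

# cpt[v][parent values] = P(v = 1 | parents). All numbers are made up.
cpt = {
    "B": {(): 0.01},
    "S": {(): 0.02},
    "E": {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.95},
    "D": {(0,): 0.04, (1,): 0.7},
    "C": {(0,): 0.02, (1,): 0.8},
}


def joint_probability(assignment):
    """Chain rule: multiply P(x_i | pa(x_i)) over all nodes."""
    prob = 1.0
    for var, pa in parents.items():
        p_one = cpt[var][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[var] == 1 else 1.0 - p_one
    return prob


# Probability that all five variables are 0.
p_all_zero = joint_probability(dict.fromkeys(parents, 0))

# Sanity check: the joint sums to 1 over all 2^5 assignments.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)

# Independent parameters: one per row of each conditional table.
n_params = sum(len(table) for table in cpt.values())
print(p_all_zero, total, n_params)
```

With this structure the network needs $1 + 1 + 4 + 2 + 2 = 10$ independent parameters, compared with $2^5 - 1 = 31$ for a full joint table over 5 binary variables.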

Conditional independence: variables $X$ and $Y$ are conditionally independent given $Z$ if and only if $P(X, Y \mid Z) = P(X \mid Z)\,P(Y \mid Z)$, written $(X \perp Y \mid Z)$. Equivalently, $(X \perp Y \mid Z)$ if and only if $P(X \mid Z) = P(X \mid Y, Z)$. Given $Z$, information about $Y$ provides no additional information about $X$, and vice versa.
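
The factorization test for conditional independence can be run numerically on a small joint table. The sketch below assumes a joint distribution over three binary variables stored as `joint[(x, y, z)]`; the function name and both example tables are illustrative assumptions.

```python
import math
from itertools import product


def conditionally_independent(joint, tol=1e-9):
    """Check P(x, y | z) == P(x | z) * P(y | z) for every assignment."""
    xs = sorted({k[0] for k in joint})
    ys = sorted({k[1] for k in joint})
    zs = sorted({k[2] for k in joint})
    for z in zs:
        p_z = sum(joint[(x, y, z)] for x in xs for y in ys)
        if p_z == 0.0:
            continue  # conditioning on a zero-probability event is undefined
        for x, y in product(xs, ys):
            p_xy = joint[(x, y, z)] / p_z
            p_x = sum(joint[(x, yy, z)] for yy in ys) / p_z
            p_y = sum(joint[(xx, y, z)] for xx in xs) / p_z
            if not math.isclose(p_xy, p_x * p_y, abs_tol=tol):
                return False
    return True


def bernoulli(p_one, value):
    """P(V = value) for a binary variable with P(V = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Example 1: Z is a common cause of X and Y (made-up numbers).
p_z1 = 0.7
p_x1_given_z = {0: 0.2, 1: 0.9}
p_y1_given_z = {0: 0.5, 1: 0.1}
fork_joint = {
    (x, y, z): bernoulli(p_z1, z)
    * bernoulli(p_x1_given_z[z], x)
    * bernoulli(p_y1_given_z[z], y)
    for x, y, z in product((0, 1), repeat=3)
}

# Example 2: X always equals Y, and Z is an unrelated fair coin.
copy_joint = {
    (x, y, z): (0.25 if x == y else 0.0)
    for x, y, z in product((0, 1), repeat=3)
}

fork_result = conditionally_independent(fork_joint)
copy_result = conditionally_independent(copy_joint)
print(fork_result, copy_result)
```

In the first table, knowing $Z$ removes any dependence between $X$ and $Y$; in the second, $X$ and $Y$ remain perfectly dependent regardless of $Z$.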
d-separation: a path between $A$ and $B$ is d-separated by $\mathcal{C}$, where $\mathcal{C}$ is a set of evidence variables, if any of the following conditions holds:

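The path contains a chain of nodes, $X \to Y \to Z$, such that $Y$ is in $\mathcal{C}$
The path contains a fork, $X \leftarrow Y \to Z$, such that $Y$ is in $\mathcal{C}$
The path contains an inverted fork, $X \to Y \leftarrow Z$, such that $Y$ is not in $\mathcal{C}$ and no descendant of $Y$ is in $\mathcal{C}$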

If every path between $A$ and $B$ is d-separated by $\mathcal{C}$, then $A$ and $B$ are d-separated by $\mathcal{C}$, which implies $(A \perp B \mid \mathcal{C})$.
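
The three blocking conditions can be turned into a small checker that enumerates every simple path between two nodes and tests each consecutive triple on it. This is a sketch for small graphs only, since path enumeration is exponential in the worst case. The function names and the 5-node example structure (`B` and `S` are parents of `E`; `E` is the parent of `D` and `C`) are assumptions made for illustration.

```python
def descendants(children, node):
    """All nodes reachable from `node` by following directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def path_is_blocked(path, parents, children, evidence):
    """True if some consecutive triple on the path meets a blocking condition."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        inverted_fork = prev in parents[mid] and nxt in parents[mid]
        if inverted_fork:
            # Blocked if neither `mid` nor any of its descendants is observed.
            if mid not in evidence and not (descendants(children, mid) & evidence):
                return True
        elif mid in evidence:
            # Chain or fork with an observed middle node.
            return True
    return False


def d_separated(a, b, evidence, parents):
    """True if every undirected simple path from a to b is blocked."""
    evidence = set(evidence)
    children = {}
    for node, pas in parents.items():
        for pa in pas:
            children.setdefault(pa, set()).add(node)
    neighbors = {n: set(parents[n]) | children.get(n, set()) for n in parents}

    def simple_paths(node, visited):
        if node == b:
            yield visited
            return
        for nb in neighbors[node]:
            if nb not in visited:
                yield from simple_paths(nb, visited + [nb])

    return all(
        path_is_blocked(path, parents, children, evidence)
        for path in simple_paths(a, [a])
    )


# Hypothetical structure: parents[v] is the set of parents of node v.
parents = {"B": set(), "S": set(), "E": {"B", "S"}, "D": {"E"}, "C": {"E"}}

fork_blocked = d_separated("D", "C", {"E"}, parents)      # fork at E, E observed
fork_open = d_separated("D", "C", set(), parents)         # fork at E, E unobserved
collider_blocked = d_separated("B", "S", set(), parents)  # inverted fork at E
collider_open = d_separated("B", "S", {"C"}, parents)     # descendant of E observed
print(fork_blocked, fork_open, collider_blocked, collider_open)
```

The last case shows the less intuitive rule: observing a descendant of the middle node of an inverted fork opens the path between its parents.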
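Markov blanket: the minimal set of nodes that, if their values were known, makes $X$ conditionally independent of all other nodes. The Markov blanket of a node consists of its parents, its children, and the other parents of its children.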
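
Given the parent sets of a network, the Markov blanket of a node can be read off directly from its definition. The sketch below uses a hypothetical 5-node structure (`B` and `S` are parents of `E`; `E` is the parent of `D` and `C`); the function name and the structure are assumptions made for illustration.

```python
def markov_blanket(node, parents):
    """Parents, children, and the children's other parents of `node`."""
    children = {n for n, pas in parents.items() if node in pas}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}


# Hypothetical structure: parents[v] is the set of parents of node v.
parents = {"B": set(), "S": set(), "E": {"B", "S"}, "D": {"E"}, "C": {"E"}}

blanket_b = markov_blanket("B", parents)  # child E plus E's other parent S
blanket_e = markov_blanket("E", parents)  # parents B, S and children D, C
blanket_d = markov_blanket("D", parents)  # only its parent E
print(blanket_b, blanket_e, blanket_d)
```

Note that `S` belongs to the blanket of `B` even though no edge connects them, because both are parents of `E`.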