Bayes Classifiers - Bayesian Belief Networks - Machine Learning

Concept and Mechanism

Bayesian belief networks are probabilistic graphical models which, unlike naive Bayesian classifiers, allow the representation of dependencies among subsets of attributes.
The naive Bayesian classifier makes the assumption of class conditional independence, that is, given the class label of a tuple, the values of the attributes are assumed to be conditionally independent of one another.
In practice, however, dependencies can exist between variables (attributes).
Bayesian belief networks provide a graphical model of causal relationships between attributes.
A belief network is defined by two components:
- a directed acyclic graph
  - Node: represents a random variable (attribute), which can be discrete- or continuous-valued
  - Edge: represents a probabilistic dependence. If an arc is drawn from a node Y to a node Z, then Y is a parent (immediate predecessor) of Z, and Z is a descendant of Y.
- a set of conditional probability tables (CPTs), one for each variable
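The graph component can be sketched as a mapping from each node to its list of parents. This is a minimal illustration, not a definitive implementation: the four-node structure below is an assumption based on the lung cancer example discussed in these notes, and `topological_order` is a hypothetical helper that relies on the standard-library `graphlib` module (Python 3.9+) to confirm that the graph is acyclic.

```python
from graphlib import CycleError, TopologicalSorter

# Assumed example structure: an arc Y -> Z is stored as Y appearing in parents[Z].
parents = {
    "FamilyHistory": [],
    "Smoker": [],
    "LungCancer": ["FamilyHistory", "Smoker"],
    "PositiveXRay": ["LungCancer"],
}


def topological_order(parent_map):
    """Return the nodes so that every parent precedes its children.

    Raises ValueError if the graph contains a cycle, i.e. it is not a DAG.
    """
    try:
        return list(TopologicalSorter(parent_map).static_order())
    except CycleError as exc:
        raise ValueError("graph contains a cycle, so it is not a DAG") from exc


order = topological_order(parents)
print(order)
```

Storing parents rather than children keeps the structure aligned with the CPTs, since each table is indexed by the values of a node's parents.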

Example

Figure: A simple Bayesian belief network with six Boolean variables. (a) A proposed causal (graphical) model, represented by a directed acyclic graph. (b) The conditional probability table for the values of the variable LungCancer (LC), showing each possible combination of the values of its parent nodes, FamilyHistory (FH) and Smoker (S).
Causal relations:

- Having lung cancer is influenced by a person’s family history of lung cancer, as well as by whether or not the person is a smoker.
- The variable PositiveXRay is independent of whether the patient has a family history of lung cancer or is a smoker, given that we know the patient has lung cancer.
  - Once we know the outcome of the variable LungCancer, the variables FamilyHistory and Smoker do not provide any additional information regarding PositiveXRay.
- The variable LungCancer is conditionally independent of Emphysema, given its parents, FamilyHistory and Smoker.
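The claim that FamilyHistory and Smoker add nothing about PositiveXRay once LungCancer is known can be checked numerically by brute-force enumeration. The sketch below is illustrative only: it uses a reduced four-node network, all probabilities are made-up assumptions rather than values from the figure, and `joint_prob` and `conditional` are hypothetical helper names.

```python
from itertools import product

NET_NODES = ["FamilyHistory", "Smoker", "LungCancer", "PositiveXRay"]
NET_PARENTS = {
    "FamilyHistory": [],
    "Smoker": [],
    "LungCancer": ["FamilyHistory", "Smoker"],
    "PositiveXRay": ["LungCancer"],
}
# Assumed values of P(node = True | parent values); keys follow NET_PARENTS order.
P_TRUE = {
    "FamilyHistory": {(): 0.1},
    "Smoker": {(): 0.3},
    "LungCancer": {(True, True): 0.8, (True, False): 0.5,
                   (False, True): 0.7, (False, False): 0.1},
    "PositiveXRay": {(True,): 0.9, (False,): 0.2},
}


def joint_prob(assignment):
    """Probability of a full assignment: product of one CPT entry per node."""
    prob = 1.0
    for node in NET_NODES:
        key = tuple(assignment[p] for p in NET_PARENTS[node])
        p_true = P_TRUE[node][key]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


def conditional(query, evidence):
    """P(query | evidence), summing the joint over all consistent assignments."""
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NET_NODES)):
        assignment = dict(zip(NET_NODES, values))
        if all(assignment[k] == v for k, v in evidence.items()):
            p = joint_prob(assignment)
            denominator += p
            if all(assignment[k] == v for k, v in query.items()):
                numerator += p
    return numerator / denominator


baseline = conditional({"PositiveXRay": True}, {"LungCancer": True})
for fh, s in product([True, False], repeat=2):
    extra = conditional(
        {"PositiveXRay": True},
        {"LungCancer": True, "FamilyHistory": fh, "Smoker": s},
    )
    # Both columns should agree up to floating-point rounding.
    print(fh, s, round(extra, 6), round(baseline, 6))
```

Enumeration is exponential in the number of variables, so this is suitable only for checking small examples.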

Conditional probability table (CPT):
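The CPT for a variable $Y$ specifies the conditional distribution $P(Y \mid Parents(Y))$, where $Parents(Y)$ are the parents of $Y$. Figure (b) shows a CPT for the variable LungCancer. The conditional probability for each known value of LungCancer is given for each possible combination of the values of its parents. For instance, we can interpret the upper leftmost and bottom rightmost entries as $P(LungCancer = yes \mid FamilyHistory = yes, Smoker = yes)$ and $P(LungCancer = no \mid FamilyHistory = no, Smoker = no)$, respectively.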
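A CPT for a Boolean variable can be sketched as a dictionary keyed by the combination of parent values. The probabilities below are illustrative assumptions, not values read from Figure (b), and `cpt_lookup` is a hypothetical helper name. Only the probability of LungCancer = yes is stored, since the probability of LungCancer = no is its complement.

```python
# Assumed illustrative values of P(LungCancer = yes | FamilyHistory, Smoker).
# Keys are (family_history, smoker) pairs.
p_lc_yes = {
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}


def cpt_lookup(lung_cancer, family_history, smoker):
    """Return P(LungCancer = lung_cancer | FamilyHistory, Smoker)."""
    p_yes = p_lc_yes[(family_history, smoker)]
    # Each column of the table must sum to 1, so the "no" row is the complement.
    return p_yes if lung_cancer else 1.0 - p_yes


# Entry for LungCancer = yes given FamilyHistory = yes and Smoker = yes.
print(cpt_lookup(True, True, True))
# Entry for LungCancer = no given FamilyHistory = no and Smoker = no.
print(cpt_lookup(False, False, False))
```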

More formally, let $X = (x_1, \dots, x_n)$ be a data tuple described by the variables $Y_1, \dots, Y_n$, respectively. Recall that each variable is conditionally independent of its nondescendants in the network graph, given its parents. This allows the network to provide a complete representation of the existing joint probability distribution with the following equation:

$$P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid Parents(Y_i)),$$

where $P(x_1, \dots, x_n)$ is the probability of a particular combination of values of $X$, and the values for $P(x_i \mid Parents(Y_i))$ correspond to the entries in the CPT for $Y_i$.
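The factorization of the joint distribution can be sketched as a product of one CPT entry per variable. This is a minimal sketch under stated assumptions: the network is reduced to four Boolean variables, every probability is a made-up illustrative value, and `BN_PARENTS`, `BN_CPT_TRUE` and `joint_probability` are hypothetical names.

```python
from itertools import product

# Assumed structure: each node maps to the ordered list of its parents.
BN_PARENTS = {
    "FamilyHistory": [],
    "Smoker": [],
    "LungCancer": ["FamilyHistory", "Smoker"],
    "PositiveXRay": ["LungCancer"],
}
# Assumed values of P(node = True | parents); keys are tuples of parent values.
BN_CPT_TRUE = {
    "FamilyHistory": {(): 0.1},
    "Smoker": {(): 0.3},
    "LungCancer": {(True, True): 0.8, (True, False): 0.5,
                   (False, True): 0.7, (False, False): 0.1},
    "PositiveXRay": {(True,): 0.9, (False,): 0.2},
}


def joint_probability(assignment):
    """Multiply P(x_i | Parents(Y_i)) over all variables Y_i."""
    prob = 1.0
    for node, node_parents in BN_PARENTS.items():
        key = tuple(assignment[p] for p in node_parents)
        p_true = BN_CPT_TRUE[node][key]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


everything_true = {name: True for name in BN_PARENTS}
print(joint_probability(everything_true))  # 0.1 * 0.3 * 0.8 * 0.9, up to rounding

# Sanity check: the joint probabilities of all 2^4 assignments sum to 1.
names = list(BN_PARENTS)
total = sum(
    joint_probability(dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
)
print(round(total, 10))
```

The network needs only 1 + 1 + 4 + 2 = 8 stored probabilities here, whereas an explicit joint table over four Boolean variables would need 15 independent entries.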