Concept and Mechanism

    • Bayesian belief networks—probabilistic graphical models, which unlike naive Bayesian classifiers allow the representation of dependencies among subsets of attributes. 贝叶斯信念网络——概率图形模型,与朴素贝叶斯分类器不同,它允许表示属性子集之间的依赖关系。
    • The naive Bayesian classifier makes the assumption of class conditional independence, that is, given the class label of a tuple, the values of the attributes are assumed to be conditionally independent of one another.
    • In practice, however, dependencies can exist between variables (attributes).
    • Bayesian belief networks provide a graphical model of causal relationships between attributes.
    • A belief network is defined by two components
      • a directed acyclic graph
        • Node: represents a random variable (attribute), can be discrete- or continuous-valued
        • Edge: represents a probabilistic dependence, If an arc is drawn from a node Y to a node Z, then Y is a parent or immediate predecessor of Z.
      • a set of conditional probability tables

    Example
    image.png
    Simple Bayesian belief network with six boolean variables. (a) A proposed causal(graphical) model, represented by a directed acyclic graph. (b) The conditional probability table for the values of the variable LungCancer (LC) showing each possible combination of the values of its parent nodes, FamilyHistory (FH) and Smoker (S).
    Causal relations:

    • having lung cancer is influenced by a person’s family history of lung cancer, as well as whether or not the person is a smoker.
    • Variable PositiveXRay is independent of whether the patient has a family history of lung cancer or is a smoker, given that we know the patient has lung cancer.
      • Once we know the outcome of the variable LungCancer, then the variables FamilyHistory and Smoker do not provide any additional information regarding PositiveXRay.
    • Variable LungCancer is conditionally independent of Emphysema, given its parents, FamilyHistory and Smoker.

    Conditional probability table (CPT):
    The CPT for a variable Bayesian Belief Networks - 图2 specifies the conditional distribution Bayesian Belief Networks - 图3, where Bayesian Belief Networks - 图4 are the parents of Bayesian Belief Networks - 图5. Figure (b) shows a CPT for the variable LungCancer. The conditional probability for each known value of LungCancer is given for each possible combination of the values of its parents. For instance, we can interpret the upper leftmost and bottom rightmost entries as
    Bayesian Belief Networks - 图6
    Bayesian Belief Networks - 图7
    More formally, let Bayesian Belief Networks - 图8 be a data tuple described by the variables. Recall that each variable is conditionally independent of its nondescendants in the network graph, given its parents. This allows the network to provide a complete representation of the existing joint probability distribution with the following equation:
    Bayesian Belief Networks - 图9,
    where Bayesian Belief Networks - 图10 is the probability of a particular combination of values of Bayesian Belief Networks - 图11, and the values for Bayesian Belief Networks - 图12 correspond to the entries in the CPT for Bayesian Belief Networks - 图13.