Bayesian Belief Networks - Training a Belief Network - Machine Learning

How to construct a directed network?

The network topology (or “layout” of nodes and arcs) may be constructed by human experts or alternatively inferred from the data.
The network variables may be observable or hidden in all or some of the training tuples. The hidden data case is also referred to as missing values or incomplete data.
Several algorithms exist for learning the network topology from the training data given observable variables.
Human experts usually have a good grasp of the direct conditional dependencies that hold in the domain under analysis, and can design the network topology accordingly. Typically, these conditional dependencies are thought of as causal relationships, e.g. that Smoking _causes_ LungCancer. Experts must specify conditional probabilities for some of the nodes that participate in these direct dependencies, i.e. some of the entries in the conditional probability tables (CPTs). These probabilities can then be used to compute the remaining probability values.

How to learn the network?

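If the network topology is known and all the variables are observable in the training data:
- Computing the CPT entries is straightforward, much like estimating the probabilities in naive Bayes classification: each entry is a relative frequency counted from the training tuples.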
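
As a rough illustration of the fully observed case, the sketch below estimates one CPT by counting. The function name `estimate_cpt`, the dictionary layout, and the six toy tuples are all hypothetical and chosen only for this example; the variable names Smoking and LungCancer are borrowed from the text above.

```python
from collections import Counter

def estimate_cpt(rows, child, parents):
    """Estimate P(child | parents) as relative frequencies.

    Assumes every row is a dict in which all variables are observed.
    Returns a dict keyed by ((parent values...), child value).
    """
    joint = Counter()   # counts of (parent values, child value)
    margin = Counter()  # counts of parent values alone
    for row in rows:
        key = tuple(row[p] for p in parents)
        joint[(key, row[child])] += 1
        margin[key] += 1
    return {(key, value): n / margin[key] for (key, value), n in joint.items()}

# Hypothetical training tuples, invented for illustration only.
rows = [
    {"Smoking": True, "LungCancer": True},
    {"Smoking": True, "LungCancer": False},
    {"Smoking": True, "LungCancer": True},
    {"Smoking": True, "LungCancer": True},
    {"Smoking": False, "LungCancer": False},
    {"Smoking": False, "LungCancer": False},
]

cpt = estimate_cpt(rows, "LungCancer", ["Smoking"])
for (parent_values, value), prob in sorted(cpt.items()):
    print(f"P(LungCancer={value} | Smoking={parent_values[0]}) = {prob:.2f}")
```

Combinations that never occur in the data are simply absent from the returned dictionary. A practical implementation would probably add smoothing (for example, Laplace correction) so that unseen combinations do not end up with zero probability.
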
When the network topology is given and some of the variables are hidden:
- Several heuristic methods exist, and many software packages provide implementations.
- The gradient descent method is well known: it works by treating each conditional probability as a _weight_. It initialises the weights randomly and then iteratively adjusts each one by a small amount to raise the product of the computed probabilities of the data points in the training set (the likelihood of the data). Because the objective is being increased, this is strictly speaking gradient ascent on the likelihood. It stops when the product no longer increases.
- This is computationally demanding, but it has the benefit that human domain knowledge is used in the solution, both to design the network structure and to assign initial probability values.
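
The iterative procedure described above can be sketched for a very small network. Everything here is an assumption made for illustration: the topology (one hidden binary node H with two observed binary children X1 and X2), the toy data, the step size, and the function names. One deliberate deviation is worth naming: instead of adjusting the probabilities directly and renormalising, the sketch stores each distribution as logits and applies a softmax, which keeps every weight a valid probability. It also maximises the log of the product rather than the product itself, which has the same maximiser.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def posterior_and_likelihood(p_h, p_x, x):
    """Return P(H | x) and P(x) for one observed tuple x = (x1, x2)."""
    joint = [p_h[h] * p_x[0][h][x[0]] * p_x[1][h][x[1]] for h in range(2)]
    total = sum(joint)
    return [j / total for j in joint], total

def train(data, lr=0.2, tol=1e-9, max_iter=5000, seed=0):
    rng = random.Random(seed)
    # Random initial weights, stored as logits (an assumption of this sketch).
    t_h = [rng.uniform(-1, 1) for _ in range(2)]
    t_x = [[[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
           for _ in range(2)]
    n = len(data)
    history = []  # log-likelihood of the training data at each iteration
    for _ in range(max_iter):
        p_h = softmax(t_h)                                   # P(H)
        p_x = [[softmax(t_x[i][h]) for h in range(2)]        # P(Xi | H)
               for i in range(2)]
        log_lik = 0.0
        count_h = [0.0, 0.0]                                 # expected counts of H
        count_x = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
        for x in data:
            post, lik = posterior_and_likelihood(p_h, p_x, x)
            log_lik += math.log(lik)
            for h in range(2):
                count_h[h] += post[h]
                for i in range(2):
                    count_x[i][h][x[i]] += post[h]
        history.append(log_lik)
        # Stop when the likelihood is no longer increasing.
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # Small step along the gradient of the average log-likelihood.
        for h in range(2):
            t_h[h] += lr * (count_h[h] - n * p_h[h]) / n
            for i in range(2):
                for v in range(2):
                    grad = count_x[i][h][v] - count_h[h] * p_x[i][h][v]
                    t_x[i][h][v] += lr * grad / n
    return p_h, p_x, history

# Hypothetical observations of (X1, X2); H is never observed.
data = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0), (0, 1)]
p_h, p_x, history = train(data)
print(f"log-likelihood: {history[0]:.4f} -> {history[-1]:.4f} "
      f"after {len(history)} iterations")
```

The result depends on the random starting point, and the procedure may settle in a local maximum, so in practice it is common to run it from several initialisations and keep the best one.
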