Zero-probability problem
    Naïve Bayesian prediction requires each class-conditional probability to be non-zero, as otherwise the predicted probability for that class will be zero regardless of the other attributes.
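    To see why, recall that naive Bayes scores a tuple $X = (x_1, \dots, x_n)$ for a class $C$ as $P(C \mid X) \propto P(C) \prod_{k=1}^{n} P(x_k \mid C)$: a single factor $P(x_k \mid C) = 0$ forces the whole product to zero, no matter how strongly the remaining attributes support $C$.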

    Example
    Let’s assume that we extract the following two tables for the _student_ and _credit_ attributes from a customer history, where each entry represents a number of customers:

    | Buy computer \ Student | Yes | No |
    | --- | --- | --- |
    | Yes | 0 | 5 |
    | No | 3 | 7 |

    | Buy computer \ Credit | Fair | Excellent |
    | --- | --- | --- |
    | Yes | 4 | 1 |
    | No | 6 | 4 |

    Using naive Bayes, let’s classify whether a student with fair credit will buy a computer, i.e., the tuple $X = (\text{Student}=\text{Yes}, \text{Credit}=\text{Fair})$. First, we need to compute the likelihood of $X$ under each class:
    $P(X \mid \text{Buy}=\text{Yes}) = P(\text{Student}=\text{Yes} \mid \text{Buy}=\text{Yes}) \times P(\text{Credit}=\text{Fair} \mid \text{Buy}=\text{Yes}) = \frac{0}{5} \times \frac{4}{5} = 0$

    $P(X \mid \text{Buy}=\text{No}) = P(\text{Student}=\text{Yes} \mid \text{Buy}=\text{No}) \times P(\text{Credit}=\text{Fair} \mid \text{Buy}=\text{No}) = \frac{3}{10} \times \frac{6}{10} = 0.18$
    Therefore, the classifier will predict that the student will not buy a computer, irrespective of the prior. This is because no student in the history has ever bought a computer. In other words, the zero likelihood factor $P(\text{Student}=\text{Yes} \mid \text{Buy}=\text{Yes}) = 0$ means that, irrespective of the other attributes, the classifier will always classify a student tuple as not buying a computer. During the classification of an unlabelled tuple, all the other attributes have no effect whenever the _student_ attribute is _Yes_. This is not wrong, but it is inconvenient: in some cases the other attributes may have a different opinion to contribute to the classification of the tuple.
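    To make the failure concrete, here is a minimal Python sketch of the computation above (the dictionary layout and function names are my own; the counts come straight from the two tables):

```python
# Per-class count tables from the example, plus the class totals
# (5 buyers, 10 non-buyers).
counts = {
    "Yes": {"Student": {"Yes": 0, "No": 5}, "Credit": {"Fair": 4, "Excellent": 1}},
    "No":  {"Student": {"Yes": 3, "No": 7}, "Credit": {"Fair": 6, "Excellent": 4}},
}
class_totals = {"Yes": 5, "No": 10}

def likelihood(x, cls):
    """P(X | class) as a product of per-attribute conditional probabilities."""
    p = 1.0
    for attr, value in x.items():
        p *= counts[cls][attr][value] / class_totals[cls]
    return p

x = {"Student": "Yes", "Credit": "Fair"}  # a student with fair credit
print(likelihood(x, "Yes"))  # 0.0  -- the zero count wipes out the whole product
print(likelihood(x, "No"))   # 0.18
```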
    Laplace correction
    To avoid the zero probability in the likelihood, we can simply add a small constant $\alpha$ to each count in the affected row of the summary table as follows:

    | Buy computer \ Student | Yes | No |
    | --- | --- | --- |
    | Yes | $0+\alpha$ | $5+\alpha$ |
    | No | 3 | 7 |

    If we let $\alpha = 1$, which is the usual value, then the likelihood of the _Buy = Yes_ class becomes:

    $P(X \mid \text{Buy}=\text{Yes}) = \frac{0+1}{5+2} \times \frac{4}{5} = \frac{1}{7} \times \frac{4}{5} \approx 0.114$
    Using the Laplacian correction with an $\alpha$ of 1, we pretend that we have one more tuple for each possible value of _Student_ (i.e., _Yes_ and _No_ here), so the class total in the denominator grows from 5 to 7. We only pretend this while computing the likelihood factors for an attribute-class combination that has a zero count in the data for at least one of its values.
    The likelihoods for the alternative (non-zero count) values of the affected attribute are also changed, but these only come into play when we predict a _different_ customer at a different time, e.g. a non-student:

    $P(\text{Student}=\text{No} \mid \text{Buy}=\text{Yes}) = \frac{5+1}{5+2} = \frac{6}{7} \approx 0.857$
    The likelihood for the other class (for the same student with fair credit) is unchanged, since none of that class’s counts is zero:

    $P(X \mid \text{Buy}=\text{No}) = \frac{3}{10} \times \frac{6}{10} = 0.18$
    The “corrected” probability estimates are close to their “uncorrected” counterparts, yet the zero probability is avoided.
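    Here is the earlier sketch extended with this selective Laplace correction (again a sketch; the helper name and the zero-count test are my own). It applies the smoothing per attribute-class pair, matching the convention above of correcting only tables that contain a zero count:

```python
# Laplace-corrected variant of the earlier sketch (alpha = 1). As in the
# text, the correction is applied only to an attribute whose count table
# for the given class contains a zero; other factors are left as-is.
counts = {
    "Yes": {"Student": {"Yes": 0, "No": 5}, "Credit": {"Fair": 4, "Excellent": 1}},
    "No":  {"Student": {"Yes": 3, "No": 7}, "Credit": {"Fair": 6, "Excellent": 4}},
}
class_totals = {"Yes": 5, "No": 10}

def smoothed_likelihood(x, cls, alpha=1):
    p = 1.0
    for attr, value in x.items():
        table = counts[cls][attr]
        if 0 in table.values():  # zero count: pretend alpha extra tuples per value
            p *= (table[value] + alpha) / (class_totals[cls] + alpha * len(table))
        else:                    # no zero count: use the raw relative frequency
            p *= table[value] / class_totals[cls]
    return p

x = {"Student": "Yes", "Credit": "Fair"}
print(smoothed_likelihood(x, "Yes"))  # (1/7) * (4/5) ≈ 0.114
print(smoothed_likelihood(x, "No"))   # unchanged: 0.18
```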