These enhancements may be embedded within the decision tree induction algorithm:

    • Handle missing attribute values
      • Assign the most common value of the attribute
      • Assign a probability to each of the possible values
    • Attribute construction
      • Create new attributes based on existing ones that are sparsely represented
      • This reduces fragmentation, repetition, and replication
    • Continuous target variable
      • In this case the tree is called a regression tree, the leaf node classes are represented by their mean values, and the tree performs prediction (using that mean value) rather than classification.
    • Probabilistic classifier
      • Instead of majority voting to assign a class label to a leaf node, the proportion of training data objects of each class in the leaf node can be interpreted as the _probability_ of the class, and this probability can be assigned to the classification for unknown objects falling in that node at use time.
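The two missing-value strategies in the first bullet can be sketched as follows. This is a minimal illustration, not code from the source; the function names are invented for the example, and `None` is assumed to mark a missing value.

```python
from collections import Counter

def impute_most_common(values):
    """Strategy 1: replace missing entries (None) with the most
    common observed value of the attribute."""
    observed = [v for v in values if v is not None]
    most_common = Counter(observed).most_common(1)[0][0]
    return [most_common if v is None else v for v in values]

def value_probabilities(values):
    """Strategy 2: estimate a probability for each possible value from
    the observed proportions, so an object with a missing value can be
    sent down every branch with the matching fraction of its weight."""
    observed = [v for v in values if v is not None]
    counts = Counter(observed)
    total = len(observed)
    return {v: c / total for v, c in counts.items()}
```

For example, `impute_most_common(['a', None, 'a', 'b'])` fills the gap with `'a'`, while `value_probabilities` on the same list assigns `'a'` a weight of 2/3 and `'b'` a weight of 1/3.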
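The regression-tree bullet can be illustrated with a one-split sketch: each leaf stores the mean of the training targets that reached it, and prediction returns that mean. The names and the single-threshold stump are illustrative assumptions, not part of the source.

```python
def regression_leaf_mean(targets):
    """Leaf value of a regression tree: the mean of the training
    target values that fell into this leaf."""
    return sum(targets) / len(targets)

def stump_predict(x, threshold, left_mean, right_mean):
    """A one-split regression stump: route the object by the threshold
    test, then predict the mean stored at the chosen leaf."""
    return left_mean if x <= threshold else right_mean
```

The tree thus performs numeric prediction rather than classification: `regression_leaf_mean([2.0, 4.0, 6.0])` yields the leaf value 4.0.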
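The probabilistic-classifier bullet amounts to reading leaf class proportions as probabilities. A minimal sketch (the function name is illustrative):

```python
from collections import Counter

def leaf_class_probabilities(labels):
    """Turn the class proportions of training objects in a leaf into
    class probabilities, assigned to any unseen object that lands in
    this leaf at use time (instead of a single majority-vote label)."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: count / n for cls, count in counts.items()}
```

A leaf holding labels `['yes', 'yes', 'yes', 'no']` would report P(yes) = 0.75 and P(no) = 0.25 rather than the bare label `yes`.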