Gini index
    is used in the CART and IBM IntelligentMiner decision tree learners

    • All attributes are assumed nominal

    If a data set $D$ contains examples from $n$ classes, the Gini index, $gini(D)$, measures the impurity of $D$ and is defined as

    $gini(D) = 1 - \sum_{j=1}^{n} p_j^2$, where $p_j$ is the relative frequency of class $j$ in $D$

    If a data set $D$ is split on attribute $A$ into two subsets $D_1$ and $D_2$, the gini index $gini_A(D)$ is defined as the size-weighted sum of the impurity of each partition:

    $gini_A(D) = \frac{|D_1|}{|D|}\, gini(D_1) + \frac{|D_2|}{|D|}\, gini(D_2)$
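    The two formulas translate directly into code. Below is a minimal Python sketch (not from the source; `gini` and `gini_split` are hypothetical helper names):

```python
from collections import Counter

def gini(labels):
    """Impurity of a list of class labels: 1 minus the sum of
    squared relative class frequencies."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(labels_d1, labels_d2):
    """Size-weighted Gini index of a binary partition of D into D1 and D2."""
    n = len(labels_d1) + len(labels_d2)
    return (len(labels_d1) / n * gini(labels_d1)
            + len(labels_d2) / n * gini(labels_d2))
```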


    To split a node in the tree:

    • Enumerate all the possible ways of splitting each attribute
    • The attribute split that provides the smallest $gini_A(D)$ (i.e. the greatest purity) is chosen to split the node, as sketched in the code after this list
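    As a sketch of that enumeration for a nominal attribute (building on the `gini_split` helper above; `best_binary_split` is a hypothetical name), every binary partition of the attribute's value set is tried and the one with the lowest weighted Gini index is kept:

```python
from itertools import combinations

def best_binary_split(values, labels):
    """Return (value_subset, weighted_gini) for the binary partition of the
    attribute's value set with the lowest Gini index.
    Uses gini_split from the sketch above."""
    distinct = sorted(set(values))
    best = None
    # Subsets of up to half the values cover every binary partition:
    # a partition {S, complement of S} is generated through its smaller
    # side (equal-sized sides are generated twice, which is harmless).
    for size in range(1, len(distinct) // 2 + 1):
        for subset in combinations(distinct, size):
            side = set(subset)
            d1 = [y for v, y in zip(values, labels) if v in side]
            d2 = [y for v, y in zip(values, labels) if v not in side]
            g = gini_split(d1, d2)
            if best is None or g < best[1]:
                best = (side, g)
    return best
```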

    Example (continued from previous)
    $D$ has 9 tuples in class buys_computer = “yes” and 5 in “no”
    Then $gini(D) = 1 - \left(\frac{9}{14}\right)^2 - \left(\frac{5}{14}\right)^2 = 0.459$
    Now consider the attribute _income_.
    Partition $D$ into 10 objects in $D_1$ with income in {low, medium} and 4 objects in $D_2$ with income in {high}

    We have, with $D_1$ containing 7 “yes” and 3 “no” tuples and $D_2$ containing 2 of each:

    $gini_{income \in \{low, medium\}}(D) = \frac{10}{14}\, gini(D_1) + \frac{4}{14}\, gini(D_2)$
    $\quad = \frac{10}{14}\left(1 - \left(\frac{7}{10}\right)^2 - \left(\frac{3}{10}\right)^2\right) + \frac{4}{14}\left(1 - \left(\frac{2}{4}\right)^2 - \left(\frac{2}{4}\right)^2\right)$
    $\quad = 0.443 = gini_{income \in \{high\}}(D)$

    Similarly, $gini_{income \in \{low, high\}}(D) = 0.458$ and $gini_{income \in \{medium, high\}}(D) = 0.450$.
    Thus, we split income on {low, medium} (with {high} as the other partition), since this split has the lowest Gini index
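    Running the sketches above reproduces these numbers. The per-value class counts below are an assumption chosen to be consistent with the previous section's table and the 9 “yes” / 5 “no” totals (low: 3/1, medium: 4/2, high: 2/2):

```python
# Hypothetical reconstruction of the income column of the running example;
# the per-value class counts are assumed (low: 3 yes/1 no, medium: 4 yes/2 no,
# high: 2 yes/2 no), consistent with the 9 yes / 5 no totals above.
income = ['low'] * 4 + ['medium'] * 6 + ['high'] * 4
buys = (['yes'] * 3 + ['no'] * 1        # low
        + ['yes'] * 4 + ['no'] * 2      # medium
        + ['yes'] * 2 + ['no'] * 2)     # high

print(round(gini(buys), 3))             # 0.459 = gini(D)
side, g = best_binary_split(income, buys)
print(side, round(g, 3))                # {'high'} 0.443, i.e. the
                                        # {low, medium} vs {high} partition
```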

    When attributes are continuous or ordinal, the method described earlier may be used: the midpoint between each pair of adjacent sorted values is taken as a candidate split point, and the candidate with the lowest Gini index is chosen.
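    A final sketch of the midpoint method, again reusing `gini_split` from the first sketch (`best_numeric_split` is a hypothetical name):

```python
def best_numeric_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split of a
    continuous attribute: candidate thresholds are the midpoints between
    adjacent distinct sorted values; D is split into v <= t and v > t."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for t in midpoints:
        d1 = [y for v, y in zip(values, labels) if v <= t]
        d2 = [y for v, y in zip(values, labels) if v > t]
        g = gini_split(d1, d2)  # gini_split as defined in the first sketch
        if best is None or g < best[1]:
            best = (t, g)
    return best
```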