Let attribute A be a continuous-valued attribute
To determine the best split-point for A:
- Sort the values of A in increasing order
- Typically, the midpoint between each pair of adjacent values is considered as a possible split point
- (a_i + a_{i+1})/2 is the midpoint between the adjacent values a_i and a_{i+1}
- Of these candidates, the point with the minimum expected information requirement for A is selected as the split-point for A
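For reference, the expected information requirement of one candidate split-point can be written out as below. This is a sketch that assumes the usual entropy-based definition; the symbols m (number of classes) and p_i (fraction of tuples in a partition that belong to class C_i) are not defined in this section and are assumed here. D_1 and D_2 are the two partitions of D with A ≤ split-point and A > split-point.

```latex
\[
  \mathrm{Info}_A(D) \;=\; \frac{|D_1|}{|D|}\,\mathrm{Info}(D_1)
                     \;+\; \frac{|D_2|}{|D|}\,\mathrm{Info}(D_2),
  \qquad
  \mathrm{Info}(D_j) \;=\; -\sum_{i=1}^{m} p_i \log_2 p_i
\]
```

The candidate with the smallest Info_A(D) gives the purest pair of partitions, which is why it is the one selected.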
Then Split:
D1 is the set of tuples in D satisfying A ≤ split-point, and D2 is the set of tuples in D satisfying A > split-point
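The procedure above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `entropy` and `best_split_point` and the toy data are made up for this example, and entropy in bits is assumed as the information measure.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_split_point(values, labels):
    """Return (split_point, expected_info) with the minimum expected information.

    Returns (None, None) when A has fewer than two distinct values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))            # sort the values of A in increasing order
    n = len(pairs)
    best_point, best_info = None, None
    for lo, hi in zip(distinct, distinct[1:]):
        split = (lo + hi) / 2                 # midpoint of two adjacent values
        d1 = [lab for v, lab in pairs if v <= split]   # tuples with A <= split-point
        d2 = [lab for v, lab in pairs if v > split]    # tuples with A > split-point
        # Weighted entropy of the two partitions
        info = len(d1) / n * entropy(d1) + len(d2) / n * entropy(d2)
        if best_info is None or info < best_info:
            best_point, best_info = split, info
    return best_point, best_info


# Hypothetical toy data: six tuples with a numeric attribute and a class label
ages = [25, 32, 41, 47, 52, 60]
buys = ["no", "no", "yes", "yes", "yes", "no"]
split, info = best_split_point(ages, buys)
print(split, round(info, 4))
```

This version recomputes both partitions for every candidate, which keeps it close to the description but costs O(n) work per candidate; a single pass over the sorted tuples with running class counts would be the usual optimisation.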
This method can also be used for ordinal attributes with many values (where treating them simply as nominals may cause too much branching).