How decision trees work
Advantages
Disadvantages
Sensitive to noise in the data
Solution
Combine trees into an ensemble to reduce bias and variance.
- Random forest: trees trained in parallel, each with injected randomness
- Gradient boosting: trees trained sequentially, each fit to the residuals of the previous ones
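The sequential-residual idea behind gradient boosting can be sketched in a few lines. This is a minimal illustration only, assuming squared-error loss, 1-D inputs, and one-split regression stumps as the base learner. The names `fit_stump`, `gradient_boost`, and `mse`, the learning rate, and the toy data are all hypothetical and not taken from any library.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D inputs by minimising squared error."""
    best = None
    # Dropping the largest value guarantees both sides of every split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Squared-loss boosting: each new stump is fit to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mse(model, xs, ys):
    """Mean squared error of `model` on the given points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data with a rough step pattern.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 5.2, 4.8]

one_round = gradient_boost(xs, ys, n_rounds=1)
many_rounds = gradient_boost(xs, ys, n_rounds=20)
```

On this toy data the training error after 20 rounds is lower than after 1 round, because each added stump corrects part of what the earlier ones missed. Production libraries add regularisation, deeper trees, and other loss functions on top of this basic loop.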
Trees are widely used in industry
Simple, easy to tune, and often give satisfactory results
Three basic types of decision trees
The key to building a decision tree is choosing which attribute to split on at the current node.
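One common way to make that choice is to score each attribute by information gain (the drop in label entropy after the split) and pick the highest. The sketch below assumes categorical attributes stored as dicts. The helper names and the toy data are hypothetical and are not the data set from these notes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Label entropy minus the weighted entropy after splitting on `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

gains = {a: information_gain(rows, labels, a) for a in ("outlook", "windy")}
best_attribute = max(gains, key=gains.get)
```

Here `outlook` has a gain of 1 bit and `windy` a gain of 0, so the split would be made on `outlook`.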
Example
For the data on the left, build a decision tree using the information gain ratio
values is the number of samples in each class
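Since the data set itself is not reproduced in these notes, here is a simplified sketch of the same procedure on hypothetical toy data. It assumes categorical attributes and the usual definition of gain ratio as information gain divided by split information. `build_tree` and the node layout are illustrative only; each node stores its per-class sample counts under a `values` key.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of hashable items."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(rows, labels, attribute):
    """Information gain divided by split information (0 for a single-valued attribute)."""
    column = [row[attribute] for row in rows]
    split_info = entropy(column)
    if split_info == 0:
        return 0.0
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return (entropy(labels) - remainder) / split_info


def build_tree(rows, labels, attributes):
    """Recursively split on the attribute with the highest gain ratio."""
    node = {"values": dict(Counter(labels))}  # class counts at this node
    if len(set(labels)) == 1 or not attributes:
        return node  # pure node, or nothing left to split on
    best = max(attributes, key=lambda a: gain_ratio(rows, labels, a))
    if gain_ratio(rows, labels, best) <= 0:
        return node  # no attribute improves purity
    node["split_on"] = best
    node["children"] = {}
    remaining = [a for a in attributes if a != best]
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        node["children"][value] = build_tree(sub_rows, sub_labels, remaining)
    return node


# Hypothetical toy data: "id" is unique per row, "outlook" has two values.
rows = [
    {"id": "a", "outlook": "sunny"},
    {"id": "b", "outlook": "sunny"},
    {"id": "c", "outlook": "rainy"},
    {"id": "d", "outlook": "rainy"},
]
labels = ["play", "play", "stay", "stay"]

tree = build_tree(rows, labels, ["id", "outlook"])
```

Both attributes separate the classes perfectly, so plain information gain would tie. The gain ratio prefers `outlook` (ratio 1.0) over the many-valued `id` (ratio 0.5), and the root node carries `values` of 2 samples per class.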