Learning algorithms (or learners) that build models for classification and prediction are generally evaluated in the following ways. These criteria apply both when inventing a new algorithm and assessing it against others, and when selecting an algorithm suited to a particular learning problem. Once one or more algorithms have been selected and models built, the same principles can be revisited to decide which model, if any, to put into practice.
- Accuracy, often measured on benchmark data sets so the algorithm can be compared with other learning algorithms (a minimal accuracy sketch follows this list)
  - Classifier accuracy: predicting the class label
  - Predictor accuracy: estimating the value of the predicted attribute
- Speed and complexity (a timing sketch follows this list)
  - Time to construct the model (training time)
  - Time to use the model (classification/prediction time)
  - Worst-case or average-case theoretical complexity
- Scalability
  - Efficiency in handling disk-based databases
  - Potential for speed-up by parallel computation
- Robustness
  - Handling noise and outliers
- Interpretability
  - Understanding and insight provided by the model
- Other measures
  - Goodness of rules
  - Decision tree size (a tree-size sketch follows this list)
  - Compactness or simplicity of the model
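
The sketch below illustrates the first criterion: measuring classifier accuracy on held-out data. It is a minimal example, assuming scikit-learn is available; the Iris data set and the decision tree learner are stand-ins for any benchmark and any classifier, not choices prescribed here.

```python
# Minimal sketch: classifier accuracy on a benchmark data set (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Hold out a test set so accuracy reflects performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Classifier accuracy: fraction of test examples whose class label is predicted correctly.
print(f"Classifier accuracy: {accuracy_score(y_test, clf.predict(X_test)):.3f}")
```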
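Training time and classification time can be measured empirically in much the same way. A rough sketch follows, again assuming scikit-learn; the synthetic data set from make_classification is an arbitrary stand-in for a real workload.

```python
# Rough sketch: timing model construction vs. model use (assumes scikit-learn).
import time

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
clf = DecisionTreeClassifier(random_state=0)

# Time to construct the model (training time).
start = time.perf_counter()
clf.fit(X, y)
train_time = time.perf_counter() - start

# Time to use the model (classification time).
start = time.perf_counter()
clf.predict(X)
predict_time = time.perf_counter() - start

print(f"Training time: {train_time:.4f} s, classification time: {predict_time:.4f} s")
```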
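Finally, for measures such as decision tree size and model compactness, many libraries expose the fitted model's structure directly. The snippet below, assuming scikit-learn's decision tree, reads off depth, leaf count, and node count as crude proxies for simplicity; smaller, shallower trees are generally easier to interpret.

```python
# Illustrative sketch: decision tree size as a compactness measure (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Structural statistics of the fitted tree.
print(f"Depth: {clf.get_depth()}, leaves: {clf.get_n_leaves()}, nodes: {clf.tree_.node_count}")
```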