The 7 steps of a machine learning workflow:
- Data Collection
- Data Preparation
- Build Model
- Train Model
- Evaluation
- Tune
- Predict
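The seven steps can be sketched end to end on a toy problem. The example below is a minimal illustration, not a reference pipeline: the dataset, the labels `"small"`/`"large"`, the helper names (`scale`, `knn_predict`, `accuracy`, `loo_accuracy`) and the choice of k-nearest neighbours as the model are all assumptions made for the example.

```python
import math
from collections import Counter

# 1. Data collection: a tiny hand-made dataset of (height_cm, weight_kg) -> label.
rows = [
    ((150, 50), "small"), ((155, 52), "small"), ((152, 48), "small"),
    ((158, 55), "small"), ((160, 54), "small"), ((149, 47), "small"),
    ((180, 80), "large"), ((185, 85), "large"), ((178, 78), "large"),
    ((182, 90), "large"), ((190, 88), "large"), ((176, 82), "large"),
]

# 2. Data preparation: hold out every third row, then min-max scale
#    using statistics computed from the training rows only.
test_raw = rows[::3]
train_raw = [r for i, r in enumerate(rows) if i % 3 != 0]
ranges = [(min(c), max(c)) for c in zip(*(x for x, _ in train_raw))]

def scale(x):
    return tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(x, ranges))

train = [(scale(x), y) for x, y in train_raw]
test = [(scale(x), y) for x, y in test_raw]

# 3. Build model: k-nearest neighbours with majority voting.
def knn_predict(memory, x, k):
    nearest = sorted(memory, key=lambda row: math.dist(row[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# 4. Train model: kNN is a lazy learner, so "training" just stores the rows.
memory = train

# 5. Evaluation: accuracy on the held-out rows with a default k.
def accuracy(memory, rows, k):
    return sum(knn_predict(memory, x, k) == y for x, y in rows) / len(rows)

test_acc = accuracy(memory, test, k=3)

# 6. Tune: pick k by leave-one-out accuracy on the training rows.
def loo_accuracy(memory, k):
    hits = 0
    for i, (x, y) in enumerate(memory):
        rest = memory[:i] + memory[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(memory)

best_k = max((1, 3, 5), key=lambda k: loo_accuracy(memory, k))

# 7. Predict: classify a new, unseen measurement.
prediction = knn_predict(memory, scale((151, 49)), best_k)
print(test_acc, best_k, prediction)
```

Tuning on the training rows (here via leave-one-out) rather than on the held-out rows keeps the evaluation in step 5 an honest estimate.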
Common algorithms
| Learning type | Task | Algorithms |
|---|---|---|
| SUPERVISED | CLASSIFICATION | Support Vector Machines |
| | | Discriminant Analysis |
| | | Naive Bayes |
| | | Nearest Neighbor |
| | REGRESSION | Linear Regression, GLM |
| | | SVR, GPR |
| | | Ensemble Methods |
| | | Decision Trees |
| | | Neural Networks |
| UNSUPERVISED | CLUSTERING | K-Means, K-Medoids, Fuzzy C-Means |
| | | Hierarchical |
| | | Gaussian Mixture |
| | | Neural Networks |
| | | Hidden Markov |
| | DIMENSIONALITY REDUCTION | Randomized PCA |
| | | Isomap, Spectral Embedding |
| | | Kernel Approximation |
| | | LLE |
Algorithm selection
Algorithm chart
The most popular regression algorithms are:
- Ordinary Least Squares Regression (OLSR)
- Linear Regression
- Logistic Regression
- Stepwise Regression
- Multivariate Adaptive Regression Splines (MARS)
- Locally Estimated Scatterplot Smoothing (LOESS)
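As one concrete instance of the regression family above, ordinary least squares for a single feature has a closed-form solution: the slope is the covariance of x and y divided by the variance of x. The sketch below assumes one input feature; the function name `ols_fit` and the sample data are made up for illustration.

```python
def ols_fit(xs, ys):
    """Fit y = slope * x + intercept by minimising the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred cross-products and squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # generated from y = 2x + 1, no noise
slope, intercept = ols_fit(xs, ys)
print(slope, intercept)
```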
The most popular instance-based algorithms are:
- k-Nearest Neighbour (kNN)
- Learning Vector Quantization (LVQ)
- Self-Organizing Map (SOM)
- Locally Weighted Learning (LWL)
The most popular regularization algorithms are:
- Ridge Regression
- Least Absolute Shrinkage and Selection Operator (LASSO)
- Elastic Net
- Least-Angle Regression (LARS)
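The difference between Ridge and LASSO is easiest to see with a single centred feature and no intercept, where both have closed forms. This is a simplified sketch under that assumption; `ridge_weight`, `lasso_weight` and the data are hypothetical names and values chosen for the example.

```python
import math

def ridge_weight(xs, ys, lam):
    """Minimise 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2 for one feature."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def lasso_weight(xs, ys, lam):
    """Minimise 0.5 * sum((y - w*x)^2) + lam * |w| via soft-thresholding."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam, 0.0)
    return math.copysign(shrunk, sxy) / sxx

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]   # already centred
ys = [-4.0, -2.0, 0.0, 2.0, 4.0]   # y = 2x

for lam in (0.0, 5.0, 10.0, 25.0):
    print(lam, ridge_weight(xs, ys, lam), lasso_weight(xs, ys, lam))
```

Ridge shrinks the weight smoothly toward zero but never reaches it, while LASSO sets it exactly to zero once the penalty is large enough, which is why LASSO doubles as a feature selector.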
The most popular decision tree algorithms are:
- Classification and Regression Tree (CART)
- Iterative Dichotomiser 3 (ID3)
- C4.5 and C5.0 (different versions of a powerful approach)
- Chi-squared Automatic Interaction Detection (CHAID)
- Decision Stump
- M5
- Conditional Decision Trees
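The simplest entry in this family, the decision stump, is a tree with a single split. The sketch below searches every midpoint between consecutive feature values and keeps the split with the fewest training errors. It assumes one numeric feature with at least two distinct values; `fit_stump` and `predict_stump` are illustrative names.

```python
def fit_stump(xs, ys):
    """Return (threshold, left_label, right_label) with the fewest training errors."""
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    labels = sorted(set(ys))
    best = None
    for t in candidates:
        for left in labels:
            for right in labels:
                errors = sum(
                    (left if x <= t else right) != y for x, y in zip(xs, ys)
                )
                if best is None or errors < best[0]:
                    best = (errors, t, left, right)
    return best[1:]

def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right

xs = [1, 2, 3, 6, 7, 8]
ys = [0, 0, 0, 1, 1, 1]
stump = fit_stump(xs, ys)
print(stump, predict_stump(stump, 5))
```

Full tree learners such as CART apply the same idea recursively to each side of the split, usually scoring splits with an impurity measure rather than a raw error count.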
The most popular Bayesian algorithms are:
- Naive Bayes
- Gaussian Naive Bayes
- Multinomial Naive Bayes
- Averaged One-Dependence Estimators (AODE)
- Bayesian Belief Network (BBN)
- Bayesian Network (BN)
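Gaussian Naive Bayes from the list above can be written in a few lines: estimate a prior per class plus a mean and variance per feature, then pick the class with the highest log-posterior, treating features as independent given the class. This is a minimal sketch; `fit_gnb`, `predict_gnb`, the small variance floor `1e-9` and the data are assumptions for illustration.

```python
import math
from collections import defaultdict

def fit_gnb(X, y):
    """Estimate the class prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            # Small floor keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model

def predict_gnb(model, row):
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        # "Naive" step: per-feature log-likelihoods are simply added.
        for v, (mean, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

X = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (4.0, 5.0), (4.2, 4.8), (3.8, 5.2)]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gnb(X, y)
print(predict_gnb(model, (1.1, 2.1)), predict_gnb(model, (4.1, 5.1)))
```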
The most popular clustering algorithms are:
- k-Means
- k-Medians
- Expectation Maximisation (EM)
- Hierarchical Clustering
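k-Means alternates two steps until the centroids stop moving: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The sketch below takes the initial centroids as an argument so the run is deterministic; the function name `kmeans` and the points are made up for the example, and real implementations usually add smarter initialisation.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (final centroids, clusters of points)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(c) / len(c) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)
```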
The most popular artificial neural network algorithms are:
- Perceptron
- Back-Propagation
- Hopfield Network
- Radial Basis Function Network (RBFN)
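The perceptron, the first entry above, learns a linear decision boundary by nudging its weights whenever it misclassifies a sample. The sketch below trains one on the logical AND function, which is linearly separable, so the updates are guaranteed to stop. The names `train_perceptron` and `predict`, and the learning rate of 1.0, are choices made for this example.

```python
def predict(w, b, x):
    """Step activation on the weighted sum."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(w, b, x)
            if error:
                # Perceptron rule: move the boundary toward the misclassified point.
                w[0] += lr * error * x[0]
                w[1] += lr * error * x[1]
                b += lr * error
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: converged
            break
    return w, b

and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_gate)
print(w, b, [predict(w, b, x) for x, _ in and_gate])
```

A single perceptron cannot learn XOR, which is not linearly separable; multi-layer networks trained with back-propagation remove that limitation.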
The most popular deep learning algorithms are:
- Deep Boltzmann Machine (DBM)
- Deep Belief Networks (DBN)
- Convolutional Neural Network (CNN)
- Stacked Auto-Encoders
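The core operation of a convolutional network can be shown in one dimension: slide a small kernel over the input and take a weighted sum at each position. As in most deep learning libraries, the sketch computes cross-correlation (the kernel is not flipped). `conv1d` and `relu` are illustrative helpers, and the kernel is fixed by hand here, whereas a CNN would learn it.

```python
def conv1d(signal, kernel):
    """'Valid' 1-D cross-correlation: no padding, stride 1."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]

def relu(values):
    """Element-wise max(0, v), the usual non-linearity after a convolution."""
    return [max(0, v) for v in values]

signal = [0, 0, 1, 1, 0]
edge_kernel = [-1, 1]          # responds to a rising edge
feature_map = conv1d(signal, edge_kernel)
print(feature_map, relu(feature_map))
```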
The most popular ensemble algorithms are:
- Boosting
- Bootstrapped Aggregation (Bagging)
- AdaBoost
- Stacked Generalization (blending)
- Gradient Boosting Machines (GBM)
- Gradient Boosted Regression Trees (GBRT)
- Random Forest
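Bagging, one of the methods listed above, trains many copies of a base learner on bootstrap samples (rows drawn with replacement) and combines them by majority vote. The sketch below uses a deliberately simple base learner, a nearest-class-mean classifier on one feature. That learner, the helper names, the seed and the data are all assumptions for illustration; Random Forest applies the same idea with decision trees plus random feature selection.

```python
import random
from collections import Counter

def fit_centroids(sample):
    """Base learner: store the mean feature value of each class in the sample."""
    sums = {}
    for x, y in sample:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}

def predict_centroids(model, x):
    """Predict the class whose mean is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))

def bagging_fit(data, n_models=25, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: len(data) rows drawn with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit_centroids(sample))
    return models

def bagging_predict(models, x):
    """Majority vote over the individual models."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]

data = [(float(x), 0) for x in range(1, 6)] + [(float(x), 1) for x in range(11, 16)]
models = bagging_fit(data)
print(bagging_predict(models, 2.0), bagging_predict(models, 14.0))
```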
Markov