Decompose
Converting a timestamp such as 2014-09-20T20:45:40Z into categorical attributes like hour_of_the_day, part_of_the_day, etc.
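A minimal sketch of this decomposition, using only the Python standard library. The function name `decompose_timestamp`, the extra attributes `day_of_the_week` and `is_weekend`, and the hour boundaries for each part of the day are assumptions made for illustration; the right cut-offs depend on the problem.

```python
from datetime import datetime, timezone


def decompose_timestamp(ts: str) -> dict:
    """Split an ISO 8601 UTC timestamp (with a trailing 'Z') into categorical parts."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    hour = dt.hour

    # Assumed boundaries; adjust them to whatever makes sense for the data.
    if 5 <= hour < 12:
        part = "morning"
    elif 12 <= hour < 17:
        part = "afternoon"
    elif 17 <= hour < 21:
        part = "evening"
    else:
        part = "night"

    return {
        "hour_of_the_day": hour,
        "part_of_the_day": part,
        "day_of_the_week": dt.weekday(),  # Monday is 0, Sunday is 6
        "is_weekend": dt.weekday() >= 5,
    }


features = decompose_timestamp("2014-09-20T20:45:40Z")
```

Note that the hour is taken in UTC here; if the events happened in another time zone, converting to local time first is probably what you want.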
Discretization
Continuous Features
Typically, data is discretized into K partitions of equal width (equal intervals) or into partitions that each contain k% of the total data (equal frequencies).
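The two strategies can be sketched as follows. Both helper names are hypothetical, and the equal-frequency version is a simple rank-based approximation (tied values may land in different bins), not a definitive implementation.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins that all span the same interval width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)  # all values identical: a single bin
    # The maximum value would fall into bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly the same number of items."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins


data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
by_width = equal_width_bins(data, 3)          # the outlier stretches the intervals
by_frequency = equal_frequency_bins(data, 3)  # three items per bin
```

With a skewed sample like `data`, equal-width binning puts almost everything in the first bin, while equal-frequency binning keeps the bins balanced.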
Categorical Features
Values of categorical features may be combined, particularly when there are few samples for some categories.
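One common way to combine sparse categories is to fold every value seen fewer than some threshold number of times into a single catch-all label. The sketch below assumes that approach; the function name, the `min_count` threshold and the `"other"` label are illustrative choices, not part of the notes above.

```python
from collections import Counter


def combine_rare_categories(values, min_count, other_label="other"):
    """Replace categories seen fewer than min_count times with a shared label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


colors = ["red", "red", "red", "blue", "blue", "teal", "mauve"]
combined = combine_rare_categories(colors, min_count=2)
```

Here `teal` and `mauve` each appear once, so both end up under the shared label.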
Reframe Numerical Quantities
Changing units, for example from grams to kg, loses detail, but that loss may be both desirable and more efficient for calculation.
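As a small illustration, the conversion below deliberately rounds away sub-100-gram detail. The function name and the default of one decimal place are assumptions; how much precision to keep is a modeling decision.

```python
def grams_to_kg(grams, ndigits=1):
    """Convert grams to kilograms, rounding to ndigits decimal places."""
    return round(grams / 1000, ndigits)


weights_g = [1234, 15870, 402]
weights_kg = [grams_to_kg(g) for g in weights_g]
```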
Crossing
Creating new features by combining existing ones, for example by multiplying numerical features or by combining categorical variables. This is a great way to add domain expertise to the dataset.
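Both kinds of cross can be sketched in a few lines. The helper names, the `_x_` separator and the example columns (width, height, day, part of day) are hypothetical and only serve to show the idea.

```python
def cross_numeric(a, b):
    """Cross two numerical features by multiplying them element-wise."""
    return [x * y for x, y in zip(a, b)]


def cross_categorical(a, b, sep="_x_"):
    """Cross two categorical features by joining their values into one label."""
    return [f"{x}{sep}{y}" for x, y in zip(a, b)]


width = [2.0, 3.0]
height = [4.0, 5.0]
area = cross_numeric(width, height)

day = ["mon", "sat"]
part_of_day = ["morning", "evening"]
day_and_part = cross_categorical(day, part_of_day)
```

The crossed categorical feature lets a model treat "Saturday evening" differently from "Saturday" and "evening" taken separately, at the cost of more distinct categories.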
