Decompose

Converting a timestamp such as 2014-09-20T20:45:40Z into categorical attributes like hour_of_the_day, part_of_the_day, etc.
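A minimal sketch of this decomposition, using only the Python standard library. The part-of-day boundaries and the extra day_of_week and is_weekend attributes are illustrative assumptions, not part of the notes above; choose buckets that suit the problem.

```python
from datetime import datetime, timezone


def part_of_day(hour: int) -> str:
    # Bucket boundaries are an assumption made for this example.
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 21:
        return "evening"
    return "night"


def decompose_timestamp(ts: str) -> dict:
    # Parse an ISO 8601 UTC timestamp with a literal trailing "Z".
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "hour_of_the_day": dt.hour,
        "part_of_the_day": part_of_day(dt.hour),
        "day_of_week": dt.weekday(),      # Monday = 0 ... Sunday = 6
        "is_weekend": dt.weekday() >= 5,  # Saturday or Sunday
    }


features = decompose_timestamp("2014-09-20T20:45:40Z")
print(features)
```

The hour is taken in UTC here; if local behaviour matters, convert to the relevant time zone before extracting attributes.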

Discretization

Continuous Features

Typically, data is discretized into k partitions of equal width (equal intervals) or into partitions that each contain the same fraction of the total data (equal frequencies).
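The two strategies can be sketched in plain Python as follows. The function names and the sample values are hypothetical, and the rank-based equal-frequency rule is one simple choice among several (it can split tied values across neighbouring bins).

```python
def equal_width_bins(values, k):
    """Assign each value to one of k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [0 for _ in values]
    # The maximum value would land in bin k, so clamp it into bin k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_bins = equal_width_bins(values, 3)
freq_bins = equal_frequency_bins(values, 3)
print(width_bins)  # [0, 0, 0, 0, 0, 0, 0, 0, 2]
print(freq_bins)   # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

The outlier 100 shows the trade-off: equal-width bins leave most values crowded into one bin, while equal-frequency bins spread them evenly but hide how far apart the values are.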

Categorical Features

Values of categorical features may be combined, particularly when there are few samples for some categories.
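One common way to do this is to fold every category below a count threshold into a single catch-all value. The sketch below assumes a hypothetical helper combine_rare, an "other" label, and made-up sample data; the threshold is a judgment call.

```python
from collections import Counter


def combine_rare(values, min_count, other_label="other"):
    """Replace categories seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


colors = ["red", "red", "blue", "red", "teal", "blue", "mauve"]
combined = combine_rare(colors, min_count=2)
print(combined)  # ['red', 'red', 'blue', 'red', 'other', 'blue', 'other']
```

Categories can also be merged by meaning rather than by count (for example, grouping cities into regions), which requires a hand-written mapping instead of a threshold.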

Reframe Numerical Quantities

Changing units, for example from grams to kilograms, discards detail. That loss of precision may be both desirable for the model and more efficient for calculation.
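As a small illustration, assuming one decimal place of kilograms is enough precision for the task (the helper name, the sample weights, and the rounding level are all assumptions):

```python
def grams_to_kg(grams, ndigits=1):
    """Convert grams to kilograms, deliberately dropping fine detail."""
    return round(grams / 1000, ndigits)


weights_g = [1234, 560, 98765]
weights_kg = [grams_to_kg(g) for g in weights_g]
print(weights_kg)  # [1.2, 0.6, 98.8]
```

After the conversion, values that differed only by a few grams become identical, which is the intended loss of detail.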

Crossing

Creating new features by combining existing ones, for example by multiplying numerical features or by combining categorical variables. This is a good way to add domain expertise to the dataset.
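Both kinds of cross can be sketched on a list of records. The column names (city, width_m, length_m), the "_x_" separator, and the sample rows are hypothetical and chosen only for illustration.

```python
def cross_categorical(a, b, sep="_x_"):
    """Combine two categorical values into a single crossed category."""
    return f"{a}{sep}{b}"


def cross_numeric(x, y):
    """Combine two numerical values by multiplication."""
    return x * y


rows = [
    {"city": "Paris", "part_of_the_day": "evening", "width_m": 2.0, "length_m": 3.5},
    {"city": "Lagos", "part_of_the_day": "morning", "width_m": 4.0, "length_m": 2.5},
]

for row in rows:
    # Categorical cross: one category per (city, part_of_the_day) pair.
    row["city_x_part_of_the_day"] = cross_categorical(row["city"], row["part_of_the_day"])
    # Numerical cross: width times length gives an area feature.
    row["area_m2"] = cross_numeric(row["width_m"], row["length_m"])

print(rows[0]["city_x_part_of_the_day"], rows[0]["area_m2"])  # Paris_x_evening 7.0
```

Crossing categorical features multiplies the number of distinct categories, so it often pairs well with combining rare values afterwards.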