A two-class dataset that is not linearly separable. The outer ring (cyan) is class ‘0’, while the inner ring (red) is class ‘1’.
- Linearly inseparable training data: unlike linearly separable data, no straight line can be drawn to separate the two classes.
- A basic linear SVM, which requires a separating line, has no feasible solution here.
- But there is a way to extend the linear approach to this case!
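In short: no single straight line can separate the two class regions, so a basic SVM cannot provide a feasible solution, but the linear method can be extended.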
Kernel Trick
The idea is to obtain linear separation by mapping the data to a higher-dimensional space. Let’s look at an example first:

> (Left) A dataset in $\mathbb{R}^2$, not linearly separable. (Right) The same dataset after being mapped into $\mathbb{R}^3$ by a transformation $\phi$.
The hyperplane (green plane) linearly separates the two classes in the higher-dimensional space.
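As an illustration only, and not necessarily the transformation used in the figure, assume the common lift that appends the squared distance from the origin as a third coordinate. Under that assumption, a ring of radius $r$ lands at constant height $r^2$, which suggests why a flat plane can separate two concentric rings. The radii $r_{\text{in}} < r_{\text{out}}$ and the threshold $c$ are hypothetical symbols introduced for this sketch:

$$
\begin{aligned}
\phi(x_1, x_2) &= (x_1,\; x_2,\; x_1^2 + x_2^2), \\
x_1^2 + x_2^2 = r^2 \;&\Rightarrow\; \phi(x_1, x_2) = (x_1,\; x_2,\; r^2), \\
f(x_1, x_2) &= c - (x_1^2 + x_2^2), \qquad r_{\text{in}}^2 < c < r_{\text{out}}^2 .
\end{aligned}
$$

Here $f$ is linear in the lifted coordinates, positive on the inner ring, and negative on the outer ring, so the plane $x_3 = c$ separates the classes in $\mathbb{R}^3$ even though no line does so in $\mathbb{R}^2$.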
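In the above example, we can train a linear SVM classifier that successfully finds a good decision boundary in $\mathbb{R}^3$.
However, we are given the dataset in $\mathbb{R}^2$. The challenge is to find a transformation $\phi: \mathbb{R}^2 \to \mathbb{R}^3$ such that the transformed dataset is linearly separable in $\mathbb{R}^3$. In other words, the difficulty lies in how to map the low-dimensional data into a higher-dimensional space.
In this example, the transformation shown in the figure, applied to every point of the original dataset, yields the linearly separable dataset.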
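The following is a minimal sketch of that idea, not the author's implementation. It assumes a hypothetical lift `phi(x1, x2) = (x1, x2, x1^2 + x2^2)`, builds two synthetic rings with made-up radii 1 and 3, and uses a plain perceptron as a dependency-free stand-in for a linear SVM. The same linear learner is trained once on the raw 2-D points and once on the lifted 3-D points.

```python
import math

def make_rings(n_per_ring=20, r_inner=1.0, r_outer=3.0):
    """Return (points, labels): inner ring is class 1, outer ring is class 0."""
    points, labels = [], []
    for k in range(n_per_ring):
        angle = 2 * math.pi * k / n_per_ring
        for radius, label in ((r_inner, 1), (r_outer, 0)):
            points.append((radius * math.cos(angle), radius * math.sin(angle)))
            labels.append(label)
    return points, labels

def phi(point):
    """Assumed lift from R^2 to R^3: append the squared distance from the origin."""
    x1, x2 = point
    return (x1, x2, x1 * x1 + x2 * x2)

def train_perceptron(features, labels, max_epochs=1000):
    """Perceptron with bias (stand-in for a linear SVM); stops after a mistake-free epoch."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(features, labels):
            target = 1 if label == 1 else -1
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if target * score <= 0:  # misclassified or on the boundary
                w = [wi + target * xi for wi, xi in zip(w, x)]
                b += target
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

def count_errors(w, b, features, labels):
    """Count training points whose predicted class differs from the label."""
    errors = 0
    for x, label in zip(features, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        predicted = 1 if score > 0 else 0
        errors += predicted != label
    return errors

points, labels = make_rings()
lifted = [phi(p) for p in points]  # apply phi to every original point

w2, b2 = train_perceptron(points, labels)   # linear model in R^2
w3, b3 = train_perceptron(lifted, labels)   # same linear model in R^3
errors_2d = count_errors(w2, b2, points, labels)
errors_3d = count_errors(w3, b3, lifted, labels)
print("training errors in R^2:", errors_2d)
print("training errors in R^3:", errors_3d)
```

Because the inner ring lies inside the convex hull of the outer ring, no linear rule can classify the raw 2-D points without error, while the lifted points become separable by a plane. In practice the perceptron would be replaced by an actual linear SVM solver; it is used here only to keep the sketch self-contained.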
Assuming we have such a transformation $\phi$, the new classification pipeline is as follows.