Outlier Detection - IForest Outlier Detection (IForestOutlierBatchOp) - Alink 1.5.6 Documentation

Java class name: com.alibaba.alink.operator.batch.outlier.IForestOutlierBatchOp
Python class name: IForestOutlierBatchOp

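Overview

iForest (Isolation Forest) identifies outliers in a dataset and performs well on anomaly detection tasks. The algorithm builds its trees on sub-samples of the data, which reduces its computational complexity.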
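
The sub-sampling idea can be illustrated with a minimal pure-Python sketch. It is not Alink's implementation: it follows the scoring rule from the Isolation Forest paper, s(x) = 2^(-E[h(x)] / c(psi)), where h(x) is the path length of x in a tree built on a random sub-sample of size psi and c(psi) is a normalizing constant. The function names, the defaults (100 trees, sub-sample size 256, as suggested in the paper), and the extra point 9.5 in the demo data are assumptions made for illustration.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # H(n-1) approximated by ln(n-1) + Euler's constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively isolate one-dimensional points with uniformly random split values."""
    if len(values) <= 1 or depth >= max_depth:
        return {"size": len(values)}
    lo, hi = min(values), max(values)
    if lo == hi:
        return {"size": len(values)}
    split = rng.uniform(lo, hi)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return {
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for the unsplit leaf size."""
    if "split" not in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if x < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)


def iforest_scores(data, num_trees=100, subsample_size=256, seed=0):
    """Return one anomaly score in (0, 1) per point; higher means easier to isolate.

    Assumes len(data) >= 2 so that the normalizing constant is positive.
    """
    rng = random.Random(seed)
    psi = min(subsample_size, len(data))   # each tree sees only psi points
    max_depth = math.ceil(math.log2(psi))  # depth limit from the paper
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(num_trees)]
    norm = c_factor(psi)
    return [
        2.0 ** (-sum(path_length(t, x) for t in trees) / (num_trees * norm))
        for x in data
    ]


# The first six values come from the example data in this page; 9.5 is a
# hypothetical outlier added only for this illustration.
demo = [0.73, 0.24, 0.63, 0.55, 0.73, 0.41, 9.5]
for value, score in zip(demo, iforest_scores(demo)):
    print(value, round(score, 3))
```

Because every tree is built on at most `subsample_size` points and cut off at depth `ceil(log2(psi))`, the cost of training depends on the sub-sample size rather than on the full dataset size.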

References

Isolation Forest

Parameters

| Name | Chinese Name | Description | Type | Required? | Value Range | Default |
| --- | --- | --- | --- | --- | --- | --- |

Code Examples

Python Code

from pyalink.alink import *
import pandas as pd

useLocalEnv(1)

# Six one-dimensional samples; every row is labeled 0 (normal).
df = pd.DataFrame([
    [0.73, 0],
    [0.24, 0],
    [0.63, 0],
    [0.55, 0],
    [0.73, 0],
    [0.41, 0]
])
dataOp = BatchOperator.fromDataframe(df, schemaStr='val double, label int')

# Score each row with iForest; write the prediction and its detail column.
outlierOp = IForestOutlierBatchOp()\
    .setFeatureCols(["val"])\
    .setOutlierThreshold(3.0)\
    .setPredictionCol("pred")\
    .setPredictionDetailCol("pred_detail")

# Evaluate against the labels; the label value "1" denotes an outlier.
evalOp = EvalOutlierBatchOp()\
    .setLabelCol("label")\
    .setPredictionDetailCol("pred_detail")\
    .setOutlierValueStrings(["1"])

metrics = dataOp\
    .link(outlierOp)\
    .link(evalOp)\
    .collectMetrics()
print(metrics)

Java Code

package com.alibaba.alink.operator.batch.outlier;
import com.alibaba.alink.operator.batch.BatchOperator;
import com.alibaba.alink.operator.batch.evaluation.EvalOutlierBatchOp;
import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import com.alibaba.alink.operator.common.evaluation.OutlierMetrics;
import com.alibaba.alink.testutil.AlinkTestBase;
import org.junit.Assert;
import org.junit.Test;
public class IForestOutlierBatchOpTest extends AlinkTestBase {
    @Test
    public void test() throws Exception {
        // Six one-dimensional samples; every row is labeled 0 (normal).
        BatchOperator<?> data = new MemSourceBatchOp(
            new Object[][] {
                {0.73, 0},
                {0.24, 0},
                {0.63, 0},
                {0.55, 0},
                {0.73, 0},
                {0.41, 0},
            },
            new String[]{"val", "label"});
        // Score each row with iForest; write the prediction and its detail column.
        BatchOperator<?> outlier = new IForestOutlierBatchOp()
            .setFeatureCols("val")
            .setOutlierThreshold(3.0)
            .setPredictionCol("pred")
            .setPredictionDetailCol("pred_detail");
        // Evaluate against the labels; the label value "1" denotes an outlier.
        EvalOutlierBatchOp eval = new EvalOutlierBatchOp()
            .setLabelCol("label")
            .setPredictionDetailCol("pred_detail")
            .setOutlierValueStrings("1");
        OutlierMetrics metrics = data
            .link(outlier)
            .link(eval)
            .collectMetrics();
        // No row is labeled or predicted as an outlier, so accuracy is 1.0.
        Assert.assertEquals(1.0, metrics.getAccuracy(), 10e-6);
    }
}

Output

-------------------------------- Metrics: --------------------------------
Outlier values: [1] Normal values: [0]
Auc:NaN Accuracy:1 Precision:1 Recall:0 F1:0

| Pred\Real | Outlier | Normal |
| --- | --- | --- |
| Outlier | 0 | 0 |
| Normal | 0 | 6 |
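
To relate the confusion matrix above to the printed metrics, the following sketch recomputes them from the four cell counts using the textbook definitions, with outliers as the positive class. The function name `outlier_metrics` is hypothetical and this is not Alink's evaluation code. In particular, this sketch returns NaN whenever a ratio is 0/0, whereas the output above prints Precision:1 and Recall:0 for the same counts; the source does not state how Alink fills these undefined cases.

```python
import math


def outlier_metrics(tp, fp, fn, tn):
    """Textbook metrics from confusion-matrix counts (outlier = positive class)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision and recall are undefined (0/0) when their denominators are zero.
    precision = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    if math.isnan(precision) or math.isnan(recall) or precision + recall == 0:
        f1 = math.nan
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts read from the table: rows are predictions, columns are true labels.
m = outlier_metrics(tp=0, fp=0, fn=0, tn=6)
print(m)
```

All six rows are true normals predicted as normal, so accuracy is 6/6 = 1. The data contains no true outliers, which is also why AUC cannot be computed and is reported as NaN.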