Java class name: com.alibaba.alink.operator.batch.statistics.CorrelationBatchOp
Python class name: CorrelationBatchOp

Overview

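The correlation operator computes the correlation coefficient between every pair of columns in a matrix. Each coefficient lies in the range [-1, 1]. For a given pair of columns, the sample count is the number of rows in which both columns are non-null, so the count can differ from one pair to another.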
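The pairwise non-null counting described above can be illustrated with a minimal pure-Python sketch. It is not Alink's implementation: the helper name `pairwise_pearson` and the sample columns are hypothetical, and `None` stands in for a null cell.

```python
import math

def pairwise_pearson(xs, ys):
    """Pearson correlation over the rows where both values are present.

    Returns (coefficient, count). Illustrative helper, not part of Alink.
    """
    # Keep only rows in which both columns are non-null.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return float("nan"), n
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        # A constant column has no defined correlation.
        return float("nan"), n
    return sxy / math.sqrt(sxx * syy), n

# Hypothetical columns with nulls in different rows.
a = [1.0, 2.0, None, 4.0, 5.0]
b = [2.0, None, 6.0, 8.0, 10.0]
c = [5.0, 4.0, 3.0, 2.0, 1.0]

r_ab, n_ab = pairwise_pearson(a, b)  # uses rows 0, 3, 4
r_ac, n_ac = pairwise_pearson(a, c)  # uses rows 0, 1, 3, 4
print(n_ab, n_ac)
```

Here the pair (a, b) is computed from three rows and the pair (a, c) from four, which is why the count is a property of each column pair rather than of the whole table.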

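Parameters

| Name | Display name | Description | Type | Required? | Allowed values | Default |
| --- | --- | --- | --- | --- | --- | --- |
| method | Method | Correlation method: "PEARSON" or "SPEARMAN". Defaults to "PEARSON". | String | | "PEARSON", "SPEARMAN" | "PEARSON" |
| selectedCols | Selected column names | Names of the columns to include in the computation | String[] | | | null |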
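To make the difference between the two `method` values concrete, here is a small pure-Python sketch of the usual textbook definitions: Pearson on the raw values, and Spearman as Pearson applied to ranks. The function names are hypothetical, and assigning tied values their average rank is an assumption (a common convention), not a statement about how Alink handles ties.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation of two equal-length, fully populated lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# A monotone but non-linear relationship (y = x ** 3).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 8.0, 27.0, 64.0, 125.0]
print(pearson(x, y), spearman(x, y))
```

Because `y` increases whenever `x` does, the rank-based coefficient is 1, while the Pearson coefficient stays below 1 since the relationship is not linear. That is the usual reason to choose "SPEARMAN" over the default.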

Code examples

Python code

    from pyalink.alink import *
    import pandas as pd

    # Run Alink in a local environment with parallelism 1.
    useLocalEnv(1)

    df = pd.DataFrame([
        [0.0, 0.0, 0.0],
        [0.1, 0.2, 0.1],
        [0.2, 0.2, 0.8],
        [9.0, 9.5, 9.7],
        [9.1, 9.1, 9.6],
        [9.2, 9.3, 9.9]])

    source = BatchOperator.fromDataframe(df, schemaStr='x1 double, x2 double, x3 double')

    # Correlate the three selected columns (method defaults to PEARSON).
    corr = CorrelationBatchOp()\
        .setSelectedCols(["x1", "x2", "x3"])

    corr = source.link(corr).collectCorrelation()
    print(corr)

Java code

    import org.apache.flink.types.Row;

    import com.alibaba.alink.operator.batch.BatchOperator;
    import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
    import com.alibaba.alink.operator.batch.statistics.CorrelationBatchOp;
    import com.alibaba.alink.operator.common.statistics.basicstatistic.CorrelationResult;
    import org.junit.Test;

    import java.util.Arrays;
    import java.util.List;

    public class CorrelationBatchOpTest {
        @Test
        public void testCorrelationBatchOp() throws Exception {
            List<Row> df = Arrays.asList(
                Row.of(0.0, 0.0, 0.0),
                Row.of(0.1, 0.2, 0.1),
                Row.of(0.2, 0.2, 0.8),
                Row.of(9.0, 9.5, 9.7),
                Row.of(9.1, 9.1, 9.6)
            );
            BatchOperator<?> source = new MemSourceBatchOp(df, "x1 double, x2 double, x3 double");
            // Correlate the three selected columns (method defaults to PEARSON).
            CorrelationBatchOp corr = new CorrelationBatchOp()
                .setSelectedCols("x1", "x2", "x3");
            CorrelationResult correlationResult = source.link(corr).collectCorrelation();
            System.out.println(correlationResult);
        }
    }

Output

| colName | x1 | x2 | x3 |
| --- | --- | --- | --- |
| x1 | 1.0000 | 0.9994 | 0.9990 |
| x2 | 0.9994 | 1.0000 | 0.9986 |
| x3 | 0.9990 | 0.9986 | 1.0000 |