Java class name: com.alibaba.alink.operator.batch.recommendation.FlattenKObjectBatchOp
Python class name: FlattenKObjectBatchOp

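Description

Converts recommendation results from JSON-serialized format into table format. Each serialized recommendation in the selected column is expanded into one output row per recommended object.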
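The conversion can be illustrated without Alink. The following is a minimal sketch in plain Python, assuming the serialized format matches the sample output shown later in this document (a JSON object whose fields each hold a JSON array stored as a string). The helper name `flatten_recomm` is hypothetical and is not part of the Alink API; it only mimics the reshaping, not the operator's implementation.

```python
import json


def flatten_recomm(recomm_json, output_cols):
    """Expand one serialized recommendation into a list of row tuples.

    Hypothetical helper for illustration only; not an Alink function.
    """
    obj = json.loads(recomm_json)
    # Each field holds a JSON array that is itself stored as a string,
    # so it has to be decoded a second time.
    columns = [json.loads(obj[name]) for name in output_cols]
    # zip pairs the i-th entry of every column into one output row.
    return list(zip(*columns))


rows = flatten_recomm(
    '{"item":"[1,2,3]","rating":"[0.6,0.3,0.4]"}',
    ["item", "rating"],
)
for row in rows:
    print(row)
# (1, 0.6)
# (2, 0.3)
# (3, 0.4)
```

In the operator itself, the columns named in `reservedCols` are carried along with every expanded row, which is how the `user` column appears next to each item in the results.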

Parameters

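| Name | Display name | Description | Type | Required? | Value range | Default |
| --- | --- | --- | --- | --- | --- | --- |
| outputCols | Output column names | Array of output column names (required) | String[] | ✓ | | |
| selectedCol | Selected column | Name of the column to process | String | ✓ | Column type must be [STRING] | |
| outputColTypes | Output column types | Array of output column types | String[] | | | null |
| reservedCols | Reserved columns | Columns retained by the algorithm | String[] | | | null |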

Code examples

Python code

    from pyalink.alink import *
    import pandas as pd

    useLocalEnv(1)

    # Sample (user, item, rating) records.
    df_data = pd.DataFrame([
        [1, 1, 0.6],
        [2, 2, 0.8],
        [2, 3, 0.6],
        [4, 1, 0.6],
        [4, 2, 0.3],
        [4, 3, 0.4],
    ])

    data = BatchOperator.fromDataframe(df_data, schemaStr='user bigint, item bigint, rating double')

    # Group items and ratings per user into one JSON-serialized column, "recomm".
    jsonData = Zipped2KObjectBatchOp()\
        .setGroupCol("user")\
        .setObjectCol("item")\
        .setInfoCols(["rating"])\
        .setOutputCol("recomm")\
        .linkFrom(data)\
        .lazyPrint(-1)

    # Flatten "recomm" back into one row per item, keeping the "user" column.
    recList = FlattenKObjectBatchOp()\
        .setSelectedCol("recomm")\
        .setOutputCols(["item", "rating"])\
        .setOutputColTypes(["long", "double"])\
        .setReservedCols(["user"])\
        .linkFrom(jsonData)\
        .lazyPrint(-1)

    # Trigger execution so that both lazyPrint calls produce output.
    BatchOperator.execute()

Java code

    import org.apache.flink.types.Row;

    import com.alibaba.alink.operator.batch.BatchOperator;
    import com.alibaba.alink.operator.batch.recommendation.FlattenKObjectBatchOp;
    import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
    import com.alibaba.alink.operator.common.recommendation.Zipped2KObjectBatchOp;
    import org.junit.Test;

    import java.util.Arrays;
    import java.util.List;

    public class FlattenKObjectBatchOpTest {
        @Test
        public void testFlattenKObjectBatchOp() throws Exception {
            // Sample (user, item, rating) records.
            List<Row> df_data = Arrays.asList(
                Row.of(1, 1, 0.6),
                Row.of(2, 2, 0.8),
                Row.of(2, 3, 0.6),
                Row.of(4, 1, 0.6),
                Row.of(4, 2, 0.3),
                Row.of(4, 3, 0.4)
            );
            BatchOperator<?> data = new MemSourceBatchOp(df_data, "user int, item int, rating double");
            // Group items and ratings per user into one JSON-serialized column, "recomm".
            BatchOperator<?> jsonData = new Zipped2KObjectBatchOp()
                .setGroupCol("user")
                .setObjectCol("item")
                .setInfoCols("rating")
                .setOutputCol("recomm")
                .linkFrom(data)
                .lazyPrint(-1);
            // Flatten "recomm" back into one row per item, keeping the "user" column.
            BatchOperator<?> recList = new FlattenKObjectBatchOp()
                .setSelectedCol("recomm")
                .setOutputCols("item", "rating")
                .setOutputColTypes("long", "double")
                .setReservedCols("user")
                .linkFrom(jsonData)
                .lazyPrint(-1);
            // Trigger execution so that both lazyPrint calls produce output.
            BatchOperator.execute();
        }
    }

Results

| user | recomm |
| --- | --- |
| 1 | {"item":"[1]","rating":"[0.6]"} |
| 4 | {"item":"[1,2,3]","rating":"[0.6,0.3,0.4]"} |
| 2 | {"item":"[2,3]","rating":"[0.8,0.6]"} |

| user | item | rating |
| --- | --- | --- |
| 1 | 1 | 0.6000 |
| 4 | 1 | 0.6000 |
| 4 | 2 | 0.3000 |
| 4 | 3 | 0.4000 |
| 2 | 2 | 0.8000 |
| 2 | 3 | 0.6000 |