Prepare the JSON data

users.json

    [{"name":"张三", "age":18}, {"name":"李四", "age":15}]

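Note: the whole JSON array must sit on a single line, with no line breaks inside it. By default Spark's JSON reader treats every line of the file as a separate, self-contained JSON document, so a record that is split across several lines fails to parse.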
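If the file is pretty-printed across several lines, one possible workaround on the Spark 2.1.1 used in this post is to read each file as a single string and hand those strings to the JSON reader. The sketch below assumes that approach; `MultiLineJson` and `readWholeFile` are illustrative names that are not part of the original project.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MultiLineJson {
  // Hypothetical helper: loads JSON files whose content spans several lines.
  def readWholeFile(spark: SparkSession, path: String): DataFrame = {
    // wholeTextFiles yields one (fileName, fileContent) pair per file,
    // so each file reaches the JSON parser as one complete document.
    val contents = spark.sparkContext.wholeTextFiles(path).map(_._2)
    // read.json(RDD[String]) exists in Spark 2.1; later releases deprecate it.
    spark.read.json(contents)
  }
}
```

Spark 2.2 and later also provide a `multiLine` reader option for the same purpose, which may be the simpler choice after upgrading.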

Maven dependencies

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.1.1</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <!-- Scala compiler plugin; without it the Scala classes are not compiled or packaged into the jar -->
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.4.6</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

Code

    import org.apache.spark.sql.SparkSession

    object CreateDataFrame {
      def main(args: Array[String]): Unit = {
        // Local session with two worker threads
        val sparkSession = SparkSession.builder()
          .appName("CreateDataFrame")
          .master("local[2]")
          .getOrCreate()

        // Load the JSON file; the schema is inferred from the data
        val dataFrame = sparkSession.read.json("E:\\ZJJ_SparkSQL\\demo01\\src\\main\\resources\\users.json")

        // Register the DataFrame as a temporary view so it can be queried with SQL
        dataFrame.createOrReplaceTempView("user")
        // Mark the DataFrame for caching so repeated queries can reuse it
        dataFrame.cache()

        sparkSession.sql("select * from user").show()

        sparkSession.stop()
      }
    }
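The same kind of query can also be written with the DataFrame API instead of SQL text. A minimal sketch, assuming a DataFrame with the `name` and `age` columns from `users.json`; `UserQueries` and `namesOlderThan` are hypothetical names introduced only for illustration:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object UserQueries {
  // Roughly equivalent to: select name from user where age > minAge
  def namesOlderThan(users: DataFrame, minAge: Int): DataFrame =
    users.filter(col("age") > minAge).select("name")
}
```

With the `dataFrame` value from the program above, `UserQueries.namesOlderThan(dataFrame, 16).show()` would list only the users older than 16, and no temporary view is needed.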

Result

    +---+----+
    |age|name|
    +---+----+
    | 18| 张三|
    | 15| 李四|
    +---+----+
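The columns come out as `age`, `name` rather than in file order because Spark infers the schema from the JSON and sorts the inferred fields by name. If a fixed column order or specific types are wanted, a schema can be passed to the reader explicitly. A sketch under that assumption (the `UsersSchema` object and its members are illustrative names, not part of the original project):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object UsersSchema {
  // Explicit schema: column order follows this definition, not the alphabet
  val schema: StructType = StructType(Seq(
    StructField("name", StringType, nullable = true),
    StructField("age", IntegerType, nullable = true)
  ))

  // Reads the JSON file with the schema above instead of inferring one
  def read(spark: SparkSession, path: String): DataFrame =
    spark.read.schema(schema).json(path)
}
```

Supplying the schema also skips the extra pass over the data that inference needs.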

Source code on Gitee

https://gitee.com/crow1/ZJJ_SparkSQL/blob/master/demo01/src/main/java/com/CreateDataFrame.scala