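1.1 Configure the parent project Spark
- Create a Maven project named Spark and delete its src and test folders so that it serves only as the parent project
- Add Scala support to the project
Modify the pom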

    <dependencies>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>3.0.0</version>
      </dependency>
      <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.17</version>
      </dependency>
      <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.16.10</version>
      </dependency>
    </dependencies>

1.2 Configure the WordCount module
Configure the pom

    <build>
      <finalName>WordCount</finalName>
      <plugins>
        <plugin>
          <groupId>net.alchim31.maven</groupId>
          <artifactId>scala-maven-plugin</artifactId>
          <version>3.2.2</version>
          <executions>
            <execution>
              <goals>
                <goal>compile</goal>
                <goal>testCompile</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

Write the code

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // 1. Create the Spark configuration; the SparkContext is built from it
    val conf = new SparkConf()
    conf.setAppName("WordCount")
    conf.setMaster("local[*]")
    // 2. Create the SparkContext
    val sc = new SparkContext(conf)
    // 3. Create an RDD from the input file
    val rdd = sc.textFile("text.txt")
    // 4. Split lines into words, pair each word with 1, sum the counts
    //    per word into a single partition, then sort by count (ascending)
    val res = rdd.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _, 1).sortBy(_._2)
    // 5. Print the (word, count) pairs
    res.foreach(println)
    // Release the context once the job is done
    sc.stop()
  }
}
```
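
To see what step 4 computes without starting Spark, the same chain can be sketched on an ordinary Scala `Seq`. This is only an illustration: `WordCountLocal`, `wordCount`, and the sample lines are hypothetical names and data, not part of the project above. Plain collections have no `reduceByKey`, so `groupBy` followed by a sum stands in for it here.

```scala
object WordCountLocal {
  // Mirrors the RDD pipeline on an in-memory Seq (illustrative only)
  def wordCount(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))   // split every line into words
      .map((_, 1))             // pair each word with an initial count of 1
      .groupBy(_._1)           // stand-in for reduceByKey: group pairs by word...
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // ...then sum the 1s
      .toSeq
      .sortBy(_._2)            // ascending by count, as in the Spark version

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input standing in for text.txt
    val sample = Seq("hello spark", "hello scala", "hello")
    wordCount(sample).foreach(println)
  }
}
```

Words with equal counts may appear in any order, because `sortBy(_._2)` compares only the count; the same holds for the RDD version.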
