1.1 Configure the parent project Spark

  1. Create a Maven project named Spark and delete its src and test folders, since it will serve only as the parent project; its pom should declare `<packaging>pom</packaging>`
  2. Add Scala support to the project
  3. Modify the pom to add the dependencies below (a note on the log4j dependency follows the snippet)

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.0.0</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.17</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.16.10</version>
        </dependency>
    </dependencies>
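Beyond what Spark pulls in transitively, the explicit log4j dependency is handy for taming Spark's verbose console output. A minimal sketch, assuming log4j 1.2.17 as declared above; the object name, logger names, and levels are illustrative choices, not part of the original setup:

```scala
import org.apache.log4j.{Level, Logger}

object QuietLogs {
  // Call this before creating the SparkContext so the levels take effect early.
  def quiet(): Unit = {
    // Show only warnings and errors from Spark itself
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
    // Silence the embedded Jetty web server almost entirely
    Logger.getLogger("org.eclipse.jetty").setLevel(Level.ERROR)
  }
}
```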

1.2 Configure the WordCount module

  4. Configure the module's pom, adding the build section below (a usage note follows it)

    <build>
        <finalName>WordCount</finalName>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
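With the compile and testCompile goals bound, `mvn package` compiles the Scala sources alongside any Java ones, and `<finalName>WordCount</finalName>` names the resulting artifact WordCount.jar rather than the default artifactId-version form.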
    
  5. Write the code (a small variant is sketched after the block):

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        // 1. Create the Spark configuration; the SparkContext is built from it
        val conf = new SparkConf()
        conf.setAppName("WordCount")
        conf.setMaster("local[*]")
        // 2. Create the SparkContext
        val sc = new SparkContext(conf)
        // 3. Create an RDD from the input file
        val rdd = sc.textFile("text.txt")
        // 4. Process: split lines into words, pair each word with 1,
        //    sum counts per word into a single partition, then sort by count
        val res = rdd.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _, 1).sortBy(_._2)
        // 5. Output
        res.foreach(println)
      }
    }
    ```
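The pipeline above sorts ascending, so the most frequent words print last, and the context is never stopped. A minimal variant sketch: the object name WordCountDesc, the in-memory sample data, and the explicit sc.stop() are additions for illustration, not part of the original code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountDesc {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCountDesc").setMaster("local[*]")
    val sc = new SparkContext(conf)
    // Parallelize a small in-memory sample instead of reading text.txt,
    // so the example runs without an input file.
    val rdd = sc.parallelize(Seq("a b a", "b a"))
    val res = rdd
      .flatMap(_.split(" "))           // split each line into words
      .map((_, 1))                     // pair each word with a count of 1
      .reduceByKey(_ + _)              // sum the counts per word
      .sortBy(_._2, ascending = false) // most frequent words first
    res.collect().foreach(println)     // prints (a,3) then (b,2)
    sc.stop()                          // shut the local context down cleanly
  }
}
```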