## 1. Create a Maven Project

[Screenshots: creating the Maven project]

## 2. Configure Maven

Right-click pom.xml and choose Maven -> Open 'settings.xml'.
Change the configuration to:

    <settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                                  http://maven.apache.org/xsd/settings-1.0.0.xsd">
      <localRepository/>
      <interactiveMode/>
      <usePluginRegistry/>
      <offline/>
      <pluginGroups/>
      <servers/>
      <mirrors>
        <mirror>
          <id>aliyunmaven</id>
          <mirrorOf>central</mirrorOf>
          <name>Aliyun public repository</name>
          <url>https://maven.aliyun.com/repository/central</url>
        </mirror>
        <!-- Second mirror of central: Maven uses the first matching mirror,
             so move this entry up to use it. Maven Central requires HTTPS. -->
        <mirror>
          <id>repo1</id>
          <mirrorOf>central</mirrorOf>
          <name>central repo</name>
          <url>https://repo1.maven.org/maven2/</url>
        </mirror>
        <!-- The id differs from the first mirror because mirror ids must be unique -->
        <mirror>
          <id>aliyunmaven-apache-snapshots</id>
          <mirrorOf>apache snapshots</mirrorOf>
          <name>Aliyun Apache repository</name>
          <url>https://maven.aliyun.com/repository/apache-snapshots</url>
        </mirror>
      </mirrors>
      <proxies/>
      <activeProfiles/>
      <profiles>
        <profile>
          <repositories>
            <repository>
              <id>aliyunmaven</id>
              <name>aliyunmaven</name>
              <url>https://maven.aliyun.com/repository/public</url>
              <layout>default</layout>
              <releases>
                <enabled>true</enabled>
              </releases>
              <snapshots>
                <enabled>true</enabled>
              </snapshots>
            </repository>
            <repository>
              <id>MavenCentral</id>
              <url>https://repo1.maven.org/maven2/</url>
            </repository>
            <repository>
              <id>aliyunmavenApache</id>
              <url>https://maven.aliyun.com/repository/apache-snapshots</url>
            </repository>
          </repositories>
        </profile>
      </profiles>
    </settings>

## 3. Configure pom.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>org.example</groupId>
      <artifactId>Demo3</artifactId>
      <version>1.0-SNAPSHOT</version>
      <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <scala.version>2.12.10</scala.version>
        <spark.version>3.1.1</spark.version>
      </properties>
      <dependencies>
        <!-- Scala -->
        <dependency>
          <groupId>org.scala-lang</groupId>
          <artifactId>scala-library</artifactId>
          <version>${scala.version}</version>
        </dependency>
        <!-- Spark (the _2.12 suffix must match the Scala version above) -->
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-core_2.12</artifactId>
          <version>${spark.version}</version>
        </dependency>
        <!-- Hadoop -->
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-client</artifactId>
          <version>2.7.0</version>
        </dependency>
      </dependencies>
    </project>

Right-click pom.xml and choose Maven -> Reload project so that the dependencies are downloaded.

## 4. Write the Spark Program

  1. Under main, create a scala directory and mark it as Sources Root.

  2. Under test, create a scala directory and mark it as Test Sources Root.

  3. Add Scala support to the project.

Right-click the project name and choose Add Framework Support.

  4. Under main/scala, add a package, then create a Scala Class -> Object.

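  5. Word-count demo

```scala
package org.simple

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object Demo {
  def main(args: Array[String]): Unit = {
    // Only show errors from Spark's own logging so the results are easy to read
    Logger.getLogger("org").setLevel(Level.ERROR)

    // "local" runs Spark inside this JVM with a single worker thread
    val sparkConf = new SparkConf().setAppName("WordCount").setMaster("local")
    val sc = new SparkContext(sparkConf)

    // Word count: split lines into words, pair each word with 1, sum per word
    val dataRDD = sc.textFile("D:\\SparkProject\\data\\test.txt")
      .flatMap(_.split(" "))
      .map(x => (x, 1))
      .reduceByKey((a, b) => a + b)
    dataRDD.foreach(println)

    // Release the context once the job has finished
    sc.stop()
  }
}
```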
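
To make the three stages of the word-count pipeline concrete, here is a minimal sketch that applies the same steps to an in-memory Seq instead of an RDD, so it runs without Spark. It is an illustration only: `WordCountSketch`, `countWords`, and the sample lines are hypothetical names and data that are not part of the original project, and `groupBy` followed by a per-word sum stands in for `reduceByKey`.

```scala
// Spark-free sketch of the word-count stages on plain Scala collections.
// All names here are hypothetical and exist only for illustration.
object WordCountSketch {
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // one element per word, as RDD.flatMap does
      .map(word => (word, 1))  // pair every word with an initial count of 1
      .groupBy(_._1)           // gather the pairs that share the same word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the 1s per word

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello spark", "hello scala") // made-up input lines
    countWords(sample).foreach(println)            // iteration order is not guaranteed
  }
}
```

On a cluster, `reduceByKey` performs the grouping and summing across partitions; the sketch only mirrors the result, not the distributed execution.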


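The demo hard-codes both the master (`local`) and a Windows input path. That is fine when running inside the IDE, but settings made on `SparkConf` in code take precedence over `spark-submit` flags, so `--master` would have no effect on it, and the input path would need to be readable from the cluster. The following is a sketch of one possible variant, not the original author's code: the object name `DemoWithArgs` and the convention of passing the input path as the first argument are assumptions made for illustration.

```scala
package org.simple

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical variant of Demo that can run from the IDE or via spark-submit.
object DemoWithArgs {
  def main(args: Array[String]): Unit = {
    // Assumption: the input path is passed as the first program argument.
    val inputPath = args.headOption.getOrElse {
      System.err.println("usage: DemoWithArgs <input-path>")
      sys.exit(1)
    }

    val conf = new SparkConf()
      .setAppName("WordCount")
      // Fall back to local mode only when no master was supplied,
      // for example when started from the IDE rather than spark-submit.
      .setIfMissing("spark.master", "local[*]")
    val sc = new SparkContext(conf)

    try {
      sc.textFile(inputPath)
        .flatMap(_.split(" "))
        .map(word => (word, 1))
        .reduceByKey(_ + _)
        .collect()          // bring the counts to the driver; suitable for small results only
        .foreach(println)   // prints on the driver instead of on the executors
    } finally {
      sc.stop()
    }
  }
}
```

With this variant the submit command would name `org.simple.DemoWithArgs` as the class and append the input path after the jar file.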
<a name="DB4Bp"></a>
## 5. Run the Program

1. Run directly

Right-click the file and choose Run.

2. Package and run

Choose File -> Project Structure.

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895595499-b8cf957f-5899-4019-95fc-aa22255fcc5a.png#align=left&display=inline&height=349&margin=%5Bobject%20Object%5D&name=image.png&originHeight=697&originWidth=997&size=123284&status=done&style=none&width=498.5)

Then follow the steps shown in the screenshot:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895671454-2eb3dfa7-8755-4615-b4ad-d77fa6f6a9a8.png#align=left&display=inline&height=481&margin=%5Bobject%20Object%5D&name=image.png&originHeight=961&originWidth=1254&size=80054&status=done&style=none&width=627)

Select the main class:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895751020-befd14c8-3222-4d35-aaed-b3f115598983.png#align=left&display=inline&height=319&margin=%5Bobject%20Object%5D&name=image.png&originHeight=638&originWidth=821&size=34746&status=done&style=none&width=410.5)

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895828390-6db7b24c-28fa-48bc-894f-9fc879cfc755.png#align=left&display=inline&height=243&margin=%5Bobject%20Object%5D&name=image.png&originHeight=486&originWidth=619&size=32882&status=done&style=none&width=309.5)

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895882712-fe4bccfb-b3bb-495f-93e9-90d946fedfef.png#align=left&display=inline&height=481&margin=%5Bobject%20Object%5D&name=image.png&originHeight=961&originWidth=1254&size=153115&status=done&style=none&width=627)

Remove the dependency jars that are not needed, keeping only the last entry:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895940937-6688d579-df09-4e33-ba12-8d5926a9b898.png#align=left&display=inline&height=481&margin=%5Bobject%20Object%5D&name=image.png&originHeight=961&originWidth=1254&size=158637&status=done&style=none&width=627)

Build the artifact:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617896003032-0b339a89-7af0-4fef-a24b-796395aaf515.png#align=left&display=inline&height=217&margin=%5Bobject%20Object%5D&name=image.png&originHeight=434&originWidth=885&size=77523&status=done&style=none&width=442.5)

Upload the jar to the Spark cluster and run it:
```bash
spark-submit --class org.simple.Demo --master spark://spark0:7077 ./SparkDemo.jar