I. Create a Maven Project
II. Configure Maven
Right-click pom.xml and choose Maven -> Open 'settings.xml'.
Edit the configuration as follows:
```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                              http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <localRepository/>
  <interactiveMode/>
  <usePluginRegistry/>
  <offline/>
  <pluginGroups/>
  <servers/>
  <mirrors>
    <!-- For a given repository id, Maven uses only the FIRST matching mirror,
         so the aliyunmaven mirror below takes precedence for central -->
    <mirror>
      <id>aliyunmaven</id>
      <mirrorOf>central</mirrorOf>
      <name>Aliyun public repository</name>
      <url>https://maven.aliyun.com/repository/central</url>
    </mirror>
    <mirror>
      <id>repo1</id>
      <mirrorOf>central</mirrorOf>
      <name>central repo</name>
      <url>https://repo1.maven.org/maven2/</url>
    </mirror>
    <mirror>
      <id>aliyunmaven-apache</id>
      <mirrorOf>apache.snapshots</mirrorOf>
      <name>Aliyun Apache snapshots repository</name>
      <url>https://maven.aliyun.com/repository/apache-snapshots</url>
    </mirror>
  </mirrors>
  <proxies/>
  <profiles>
    <profile>
      <id>aliyun</id>
      <repositories>
        <repository>
          <id>aliyunmaven</id>
          <name>aliyunmaven</name>
          <url>https://maven.aliyun.com/repository/public</url>
          <layout>default</layout>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>true</enabled>
          </snapshots>
        </repository>
        <repository>
          <id>MavenCentral</id>
          <url>https://repo1.maven.org/maven2/</url>
        </repository>
        <repository>
          <id>aliyunmavenApache</id>
          <url>https://maven.aliyun.com/repository/apache-snapshots</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <!-- The profile must have an id and be activated here, otherwise its repositories are ignored -->
  <activeProfiles>
    <activeProfile>aliyun</activeProfile>
  </activeProfiles>
</settings>
```
III. Configure pom.xml
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>Demo3</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
    <scala.version>2.12.10</scala.version>
    <spark.version>3.1.1</spark.version>
  </properties>

  <dependencies>
    <!-- Scala standard library; its 2.12.x version must match the _2.12 suffix of the Spark artifact -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <!-- Spark core -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <!-- Hadoop client; use the version that matches your cluster -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.7.0</version>
    </dependency>
  </dependencies>
</project>
```
Right-click pom.xml and choose Maven -> Reload Project.
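Note that the POM above has no build plugins, so IDEA's built-in Scala support does the compiling. If you also want `mvn package` to compile the Scala sources on the command line, the usual approach is to add scala-maven-plugin. A minimal sketch (the plugin version shown is an assumption; check Maven Central for a current release):

```xml
<build>
  <sourceDirectory>src/main/scala</sourceDirectory>
  <plugins>
    <!-- Compiles src/main/scala and src/test/scala during the Maven build -->
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>4.4.0</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```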
IV. Write the Spark Program
- Under main, create a scala directory and mark it as Sources Root.
- Under test, create a scala directory and mark it as Test Sources Root.
- Add Scala support to the project: right-click the project name and choose Add Framework Support..., then select Scala.
- Under main/scala, add a package, then create a Scala class of kind Object.
- Word-count demo:
```scala
package org.simple

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object Demo {
  def main(args: Array[String]): Unit = {
    // Silence Spark's verbose INFO logging
    Logger.getLogger("org").setLevel(Level.ERROR)
    val sparkConf = new SparkConf().setAppName("WordCount").setMaster("local")
    val sc = new SparkContext(sparkConf)
    // Word count: split each line into words, pair each word with 1, then sum per word
    val dataRDD = sc.textFile("D:\\SparkProject\\data\\test.txt")
      .flatMap(_.split(" "))
      .map(x => (x, 1))
      .reduceByKey((a, b) => a + b)
    dataRDD.foreach(println)
    sc.stop()
  }
}
```
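The `flatMap -> map -> reduceByKey` pipeline can be sketched with plain Scala collections (no Spark required) to see what it computes; the sample lines below are made-up stand-ins for the contents of test.txt:

```scala
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for the lines returned by sc.textFile
    val lines = Seq("spark hadoop spark", "hadoop spark")
    val counts = lines
      .flatMap(_.split(" "))    // split every line into words
      .map(w => (w, 1))         // pair each word with a count of 1
      .groupBy(_._1)            // plain collections lack reduceByKey; group by word instead
      .map { case (w, ps) => (w, ps.map(_._2).sum) } // sum the 1s per word
    // counts("spark") == 3, counts("hadoop") == 2
    counts.foreach(println)
  }
}
```

The collections version materializes everything in memory; Spark's `reduceByKey` performs the same per-key summation but distributed across partitions.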
## V. Run the Program
1. Run directly
Right-click the Object and choose Run.
2. Package and run
Choose File -> Project Structure:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895595499-b8cf957f-5899-4019-95fc-aa22255fcc5a.png)

Then perform the following operations as shown:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895671454-2eb3dfa7-8755-4615-b4ad-d77fa6f6a9a8.png)

Select the main class:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895751020-befd14c8-3222-4d35-aaed-b3f115598983.png)

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895828390-6db7b24c-28fa-48bc-894f-9fc879cfc755.png)

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895882712-fe4bccfb-b3bb-495f-93e9-90d946fedfef.png)

Remove the dependency JARs you do not need, keeping only the last entry:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617895940937-6688d579-df09-4e33-ba12-8d5926a9b898.png)

Build:

![image.png](https://cdn.nlark.com/yuque/0/2021/png/519483/1617896003032-0b339a89-7af0-4fef-a24b-796395aaf515.png)

Upload the JAR to the Spark cluster and run it:
```bash
spark-submit --class org.simple.Demo --master spark://spark0:7077 ./SparkDemo.jar
```
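Before submitting to the cluster, the same JAR can be smoke-tested on the local machine by swapping the master URL for local mode; the JAR name and main class below are the ones built above, and `local[*]` (one executor thread per CPU core) is a common choice for such a check:

```bash
# Run the packaged job locally, no cluster required
spark-submit \
  --class org.simple.Demo \
  --master "local[*]" \
  ./SparkDemo.jar
```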