We will use a Flink Maven archetype to create our project structure. See the Java API Quickstart for more details. For our purposes, the command to run is:
mvn archetype:generate \
-DarchetypeGroupId=org.apache.flink \
-DarchetypeArtifactId=flink-quickstart-java \
-DarchetypeVersion=1.9.0 \
-DgroupId=wiki-edits \
-DartifactId=wiki-edits \
-Dversion=0.1 \
-Dpackage=wikiedits \
-DinteractiveMode=false
You can edit the groupId, artifactId and package if you like. With the above parameters, Maven will create a project structure that looks like this:
$ tree wiki-edits
wiki-edits/
├── pom.xml
└── src
    └── main
        ├── java
        │   └── wikiedits
        │       ├── BatchJob.java
        │       └── StreamingJob.java
        └── resources
            └── log4j.properties
The pom.xml file in the root directory already has the Flink dependencies added, and src/main/java contains several example Flink programs. We can delete the example programs, since we are going to start from scratch:
$ rm wiki-edits/src/main/java/wikiedits/*.java
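Before moving on, it can be worth confirming that the project is in the expected state: pom.xml still in the root, the wikiedits package directory still present, and no example sources left behind. The following is a minimal sketch of such a check. The function name check_layout is hypothetical (it is not part of Flink or Maven), and the paths assume the archetype parameters used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flink or Maven): verify that the generated
# project is ready for new code. Assumes the package name "wikiedits".
check_layout() {
    project_dir="$1"

    # The archetype places pom.xml in the project root.
    if [ ! -f "$project_dir/pom.xml" ]; then
        echo "missing $project_dir/pom.xml" >&2
        return 1
    fi

    # The rm command removes only the .java files, so the package
    # directory itself should still exist.
    src_dir="$project_dir/src/main/java/wikiedits"
    if [ ! -d "$src_dir" ]; then
        echo "missing $src_dir" >&2
        return 1
    fi

    # Any remaining .java file means the example programs were not deleted.
    if [ -n "$(find "$src_dir" -type f -name '*.java')" ]; then
        echo "example programs still present in $src_dir" >&2
        return 1
    fi

    echo "layout ok"
}
```

Calling check_layout wiki-edits from the directory where the archetype command was run should print "layout ok" once the example programs are gone.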
As a last step we need to add the Flink Wikipedia connector as a dependency so that we can use it in our program. Edit the dependencies section of the pom.xml so that it looks like this:
<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-wikiedits_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
</dependencies>
Notice the flink-connector-wikiedits_2.11 dependency that was added. (This example and the Wikipedia connector were inspired by the Hello Samza example of Apache Samza.)
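A quick way to confirm that the edit to pom.xml took effect is to search the file for the connector's artifactId. The sketch below assumes a hypothetical helper named has_artifact. It performs a plain text match only, so it does not validate the XML and would also match the same artifactId appearing outside the dependencies section.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given pom file mentions the artifactId.
# This is a text search, not an XML parse.
has_artifact() {
    pom_file="$1"
    artifact_id="$2"

    # -F matches the pattern as a fixed string, so the dots in the
    # artifactId are not treated as regular-expression wildcards.
    # -q suppresses output; only the exit status is used.
    grep -qF "<artifactId>$artifact_id</artifactId>" "$pom_file"
}
```

For example, has_artifact wiki-edits/pom.xml flink-connector-wikiedits_2.11 && echo "connector declared" prints the message only when the dependency is present in the file.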