## Resource Planning

Component | bigdata-node1 | bigdata-node2 | bigdata-node3 |
---|---|---|---|
OS | CentOS 7.6 | CentOS 7.6 | CentOS 7.6 |
JDK | jvm | jvm | jvm |
Zeppelin | ZeppelinServer | N/A | N/A |
## Installation Media

- Version: zeppelin-0.9.0-preview1-bin-all.tgz
- Download: [https://mirror.bit.edu.cn/apache/zeppelin/zeppelin-0.9.0-preview1/zeppelin-0.9.0-preview1-bin-all.tgz](https://mirror.bit.edu.cn/apache/zeppelin/zeppelin-0.9.0-preview1/zeppelin-0.9.0-preview1-bin-all.tgz)

## Installing Zeppelin

### Extract the Archive
cd /share
wget https://mirror.bit.edu.cn/apache/zeppelin/zeppelin-0.9.0-preview1/zeppelin-0.9.0-preview1-bin-all.tgz
tar -zxvf zeppelin-0.9.0-preview1-bin-all.tgz -C ~/modules/
rm -rf zeppelin-0.9.0-preview1-bin-all.tgz
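
The commands above delete the archive whether or not extraction worked. A minimal sketch of a safer variant follows; `extract_and_clean` is a hypothetical helper name, not part of Zeppelin, and it removes the archive only when `tar` reports success.

```shell
# Extract ARCHIVE into DEST and delete the archive only if extraction succeeded.
# Usage: extract_and_clean ARCHIVE DEST
extract_and_clean() {
  archive="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  if tar -zxf "$archive" -C "$dest"; then
    rm -f "$archive"
  else
    echo "extraction failed, keeping $archive" >&2
    return 1
  fi
}
```

With the paths used in this guide, it could be called as `extract_and_clean /share/zeppelin-0.9.0-preview1-bin-all.tgz ~/modules`.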
### Zero-Configuration Startup

Zeppelin starts without any configuration changes. Change to the bin directory and run the start command.
cd ~/modules/zeppelin-0.9.0-preview1-bin-all/bin
./zeppelin-daemon.sh start
If the following output appears, Zeppelin started successfully:
Zeppelin start [ OK ]
At this point the Web UI is reachable only from a browser on the machine where Zeppelin is installed.
curl http://127.0.0.1:8080
Web UI: http://127.0.0.1:8080
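
The server may need a few seconds after the daemon reports OK before the port answers. The following is a minimal sketch of a polling check built on `curl`; `wait_for_http` is a hypothetical helper name, and the default of 30 retries is an arbitrary assumption.

```shell
# Poll a URL until it answers or the retry budget is used up.
# Usage: wait_for_http URL [RETRIES]; returns 0 on success, 1 otherwise.
wait_for_http() {
  url="$1"
  retries="${2:-30}"
  i=0
  while [ "$i" -lt "$retries" ]; do
    # -s: silent, -f: treat HTTP errors as failure, --max-time: bound each attempt
    if curl -sf -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example: `wait_for_http http://127.0.0.1:8080 && echo "Zeppelin is up"`.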
## Custom Configuration

### Access Port
cd ~/modules/zeppelin-0.9.0-preview1-bin-all/conf
cp zeppelin-site.xml.template zeppelin-site.xml
vi zeppelin-site.xml
Change the following properties:
<property>
<name>zeppelin.server.addr</name>
<value>0.0.0.0</value>
<description>Server address.</description>
</property>
<property>
<name>zeppelin.server.port</name>
<value>9527</value>
<description>Server port.</description>
</property>
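
After editing, it can help to read a value back from the file. This is a rough sketch using `awk`; `get_site_property` is a hypothetical helper, and it assumes each `<name>` and `<value>` tag sits on its own line and that the property does not also appear earlier in commented-out form.

```shell
# Print the <value> that follows the given <name> in a Hadoop-style site XML file.
# Usage: get_site_property FILE PROPERTY_NAME
get_site_property() {
  awk -v key="$2" '
    /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
    /<value>/ {
      if (n == key) {
        v = $0
        gsub(/.*<value>|<\/value>.*/, "", v)
        print v
        exit
      }
    }
  ' "$1"
}
```

For example: `get_site_property zeppelin-site.xml zeppelin.server.port`.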
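### User Authentication

By default Zeppelin allows anonymous access, so no user permissions are required. To restrict access, modify zeppelin-site.xml and the Shiro configuration file.
Edit zeppelin-site.xml and change the value of the following property from "true" to "false".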
cd ~/modules/zeppelin-0.9.0-preview1-bin-all/conf
vi zeppelin-site.xml
The property should look like this:
<property>
<name>zeppelin.anonymous.allowed</name>
<value>false</value>
<description>Anonymous user allowed by default</description>
</property>
Add the Shiro authorization configuration.
cd ~/modules/zeppelin-0.9.0-preview1-bin-all/conf
cp shiro.ini.template shiro.ini
vi shiro.ini
Configure it as follows:
[users]
admin = password1, admin
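Note: each entry has the form `username = password, role`. Here the username is admin, the password is password1, and the role is admin.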
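
To make the entry layout concrete, here is a small sketch that splits a `[users]` line into its parts. `parse_shiro_user` is a hypothetical helper for illustration only; it assumes a plain-text password that contains neither `=` nor `,`.

```shell
# Split a shiro.ini [users] entry of the form "username = password, role1, role2".
# Usage: parse_shiro_user 'admin = password1, admin'
parse_shiro_user() {
  line="$1"
  # Everything before the first "=" is the username.
  user=$(printf '%s' "$line" | cut -d= -f1 | tr -d ' ')
  rest=$(printf '%s' "$line" | cut -d= -f2-)
  # The first comma-separated field is the password, the remainder are roles.
  password=$(printf '%s' "$rest" | cut -d, -f1 | tr -d ' ')
  roles=$(printf '%s' "$rest" | cut -s -d, -f2- | tr -d ' ')
  printf 'user=%s password=%s roles=%s\n' "$user" "$password" "$roles"
}
```

Running `parse_shiro_user 'admin = password1, admin'` prints `user=admin password=password1 roles=admin`.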
Restart Zeppelin and verify.
cd ~/modules/zeppelin-0.9.0-preview1-bin-all/bin
./zeppelin-daemon.sh restart
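After the restart, refresh the web page. The notebooks that were visible before no longer appear. Click the Login button in the upper-right corner and enter the username and password in the login dialog.
Web UI: http://bigdata-node1:9527
Credentials: admin/password1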
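
The login can also be checked from the command line. The sketch below assumes Zeppelin's REST login endpoint `/api/login` with form fields `userName` and `password`, and assumes the JSON reply carries a top-level `"status":"OK"` on success; both function names are hypothetical helpers.

```shell
# POST the credentials to the login endpoint and print the JSON reply.
# Usage: zeppelin_login BASE_URL USER PASSWORD
zeppelin_login() {
  base_url="$1"
  user="$2"
  password="$3"
  curl -s --max-time 5 -X POST \
    --data-urlencode "userName=$user" \
    --data-urlencode "password=$password" \
    "$base_url/api/login"
}

# Return 0 when a login reply reports status OK (assumed reply format).
login_succeeded() {
  printf '%s' "$1" | grep -q '"status"[[:space:]]*:[[:space:]]*"OK"'
}
```

For example: `reply=$(zeppelin_login http://bigdata-node1:9527 admin password1); login_succeeded "$reply" && echo "login works"`.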
**Note: if the upper-right corner of the page shows "anonymous", click "anonymous" and then click "Interpreter", "Notebook Repos", or "Configuration". Each of these leads to the login page.**

## Interpreter
### 1. Flink Interpreter
- **Configuration (Web UI)**

Property | Value |
---|---|
FLINK_HOME | /home/vagrant/modules/flink-1.10.0 |
HADOOP_CONF_DIR | /home/vagrant/modules/hadoop-2.7.2/etc/hadoop |
HIVE_CONF_DIR | /home/vagrant/modules/apache-hive-2.3.4-bin/conf |
flink.execution.mode | remote |
flink.execution.remote.host | 192.168.0.101 |
flink.execution.remote.port | 8381 |
zeppelin.flink.enableHive | true |
zeppelin.flink.hive.version | 2.3.4 |
- **Verification**
```scala
%flink
// Example (Flink batch job): word count over three in-memory strings
val data = benv.fromElements("hello world", "hello flink", "hello hadoop")
data.flatMap(line => line.split("\\s"))
  .map(w => (w, 1))
  .groupBy(0)
  .sum(1)
  .print()
```
### 2. Spark Interpreter
- **Configuration**
1. Configure zeppelin-env.sh.
```bash
export JAVA_HOME=/home/vagrant/modules/jdk1.8.0_221
export MASTER=spark://bigdata-node1:7077
export SPARK_HOME=/home/vagrant/modules/spark-2.0.0
export HADOOP_CONF_DIR=/home/vagrant/modules/hadoop-2.7.2/etc/hadoop
```
- Configure the Spark interpreter (Web UI).

Property | Value |
---|---|
master | spark://bigdata-node1:7077 |

- Possible values for master:
  - local[*] in local mode
  - spark://master:7077 in standalone cluster
  - yarn-client in Yarn client mode
  - yarn-cluster in Yarn cluster mode
  - mesos://host:5050 in Mesos cluster
- Configure the file interpreter (HDFS, Web UI).

Property | Value |
---|---|
hdfs.url | hdfs://bigdata-node1:9000 |
hdfs.user | vagrant |
- **Data preparation**
```scala
import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset

// Download the sample CSV and distribute its lines as an RDD
val bankText = sc.parallelize(
  IOUtils.toString(
    new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
    Charset.forName("utf8")).split("\n"))

case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

// Skip the header row, strip the quotes, and register the result as a temp table
val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
  s => Bank(s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", ""),
    s(5).replaceAll("\"", "").toInt
  )
).toDF()
bank.registerTempTable("bank")
```
- **Verification**
```sql
%sql
select age, count(1) value
from bank
where age < 30
group by age
order by age
```
### 3. JDBC Interpreter
#### ■ MySQL
- **Dependencies**
cp /home/vagrant/modules/apache-hive-2.3.4-bin/lib/mysql-connector-java-*.jar /home/vagrant/modules/zeppelin-0.9.0-preview1-bin-all/interpreter/jdbc/
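
If the glob in the copy command matches nothing, the JDBC interpreter is left without a driver. A minimal sketch of a check follows; `has_jar` is a hypothetical helper, and the pattern must not contain spaces.

```shell
# Return 0 if at least one regular file in DIR matches the glob PATTERN.
# Usage: has_jar DIR PATTERN
has_jar() {
  # $2 is intentionally unquoted so the shell expands the glob.
  for f in "$1"/$2; do
    [ -f "$f" ] && return 0
  done
  return 1
}
```

For example: `has_jar /home/vagrant/modules/zeppelin-0.9.0-preview1-bin-all/interpreter/jdbc 'mysql-connector-java-*.jar' || echo "driver jar missing"`.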
- **Configuration (Web UI)**

Property | Value |
---|---|
default.url | jdbc:mysql://bigdata-node3:3306 |
default.user | root |
default.password | 123456 |
default.driver | com.mysql.jdbc.Driver |
- **Verification**
```sql
%mysql
-- show databases;
-- use mysql;
show tables;
```
#### ■ Hive
- **Dependencies**
```bash
cp /home/vagrant/modules/apache-hive-2.3.4-bin/jdbc/hive-jdbc-2.3.4-standalone.jar /home/vagrant/modules/zeppelin-0.9.0-preview1-bin-all/interpreter/jdbc/
```
- **Configuration**

Property | Value |
---|---|
default.url | jdbc:hive2://bigdata-node1:10000 |
default.driver | org.apache.hive.jdbc.HiveDriver |
default.user | hive2 |
- **Verification**
```sql
%hive
show databases;
-- use default;
-- show tables;
```