1. Configure Sentry

Install and configure Sentry.

2. Configure Hive

Integrating the Hive Metastore with Sentry

Add the following to /etc/hive/conf/hive-site.xml:

  <property>
    <name>hive.metastore.pre.event.listeners</name>
    <value>org.apache.sentry.binding.metastore.MetastoreAuthzBinding</value>
  </property>
  <property>
    <name>hive.metastore.event.listeners</name>
    <value>org.apache.sentry.binding.metastore.SentryMetastorePostEventListener</value>
  </property>

Integrating HiveServer2 with Sentry

Edit /etc/hive/conf/hive-site.xml and add the following:

  <property>
    <name>hive.server2.enable.impersonation</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.security.authorization.task.factory</name>
    <value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
  </property>
  <property>
    <name>hive.server2.session.hook</name>
    <value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
  </property>
  <property>
    <name>hive.sentry.conf.url</name>
    <value>file:///etc/hive/conf/sentry-site.xml</value>
  </property>
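
Before restarting anything, it can help to confirm that the properties listed above actually made it into hive-site.xml. The following is a minimal sketch, not part of the original setup: has_property and check_hive_site are hypothetical helper names, and the check is a plain text match on <name> elements rather than a real XML parse.

```shell
#!/bin/sh
# has_property FILE NAME: succeed if FILE contains <name>NAME</name>.
# Hypothetical helper; a fixed-string match, not an XML parser.
has_property() {
  grep -qF "<name>$2</name>" "$1"
}

# check_hive_site FILE: report each Sentry-related property from this
# section and return non-zero if any of them is missing.
check_hive_site() {
  file=$1
  missing=0
  for name in \
    hive.metastore.pre.event.listeners \
    hive.metastore.event.listeners \
    hive.security.authorization.task.factory \
    hive.server2.session.hook \
    hive.sentry.conf.url
  do
    if has_property "$file" "$name"; then
      echo "ok       $name"
    else
      echo "MISSING  $name"
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be check_hive_site /etc/hive/conf/hive-site.xml. The check only looks at property names, so a wrong value still has to be caught by reading the file.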

Using the template sentry-site.xml.hive-client.template as a reference, create sentry-site.xml in the /etc/hive/conf/ directory:

  <?xml version="1.0" encoding="UTF-8"?>
  <configuration>
    <property>
      <name>sentry.service.client.server.rpc-port</name>
      <value>8038</value>
    </property>
    <property>
      <name>sentry.service.client.server.rpc-address</name>
      <value>cdh1</value>
    </property>
    <property>
      <name>sentry.service.client.server.rpc-connection-timeout</name>
      <value>200000</value>
    </property>
    <!-- Client-side settings follow -->
    <property>
      <name>sentry.provider</name>
      <value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
    </property>
    <property>
      <name>sentry.hive.provider.backend</name>
      <value>org.apache.sentry.provider.db.SimpleDBProviderBackend</value>
    </property>
    <property>
      <name>sentry.metastore.service.users</name>
      <value>hive</value><!--queries made by hive user (beeline) skip meta store check-->
    </property>
    <property>
      <name>sentry.hive.server</name>
      <value>server1</value>
    </property>
    <property>
      <name>sentry.hive.testing.mode</name>
      <value>true</value>
    </property>
  </configuration>

3. Restart Hive

On cdh1, start or restart HiveServer2:

  $ /etc/init.d/hive-server2 restart

4. Prepare test data

Following "Securing Impala for analysts", prepare the test data:

  $ cat /tmp/events.csv
  10.1.2.3,US,android,createNote
  10.200.88.99,FR,windows,updateNote
  10.1.2.3,US,android,updateNote
  10.200.88.77,FR,ios,createNote
  10.1.4.5,US,windows,updateTag
Then run the following SQL statements in Hive:

  create database sensitive;
  create table sensitive.events (
    ip STRING, country STRING, client STRING, action STRING
  ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
  load data local inpath '/tmp/events.csv' overwrite into table sensitive.events;
  create database filtered;
  create view filtered.events as select country, client, action from sensitive.events;
  create view filtered.events_usonly as select * from filtered.events where country = 'US';

On cdh1, connect to HiveServer2 with beeline in order to create the roles and groups:

  # Note: I set the HiveServer2 JDBC port to 10001
  $ beeline -u "jdbc:hive2://cdh1:10001/" -n hive -p hive -d org.apache.hive.jdbc.HiveDriver

Run the following SQL statements to create the roles and grant them to groups:

  create role admin_role;
  GRANT ALL ON SERVER server1 TO ROLE admin_role;
  GRANT ROLE admin_role TO GROUP admin;
  GRANT ROLE admin_role TO GROUP hive;
  create role test_role;
  GRANT ALL ON DATABASE filtered TO ROLE test_role;
  GRANT ROLE test_role TO GROUP test;

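The statements above create two roles:

  • admin_role has administrator privileges, can read and write every database, and is granted to the admin and hive groups (which correspond to operating system groups).
  • test_role can read and write only the filtered database, and is granted to the test group.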
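
The test_role pattern (one role, full access to one database, granted to one group) repeats for every additional database. As a rough sketch, the three statements can be generated by a small function; db_role_sql is a hypothetical helper name, and it only prints the same statement shapes used above without validating its arguments.

```shell
#!/bin/sh
# db_role_sql ROLE DATABASE GROUP: print the statements that create ROLE,
# grant it ALL on DATABASE, and grant the role to GROUP.
# Hypothetical helper; arguments are inserted as-is, so pass only
# trusted identifiers.
db_role_sql() {
  printf 'create role %s;\n' "$1"
  printf 'GRANT ALL ON DATABASE %s TO ROLE %s;\n' "$2" "$1"
  printf 'GRANT ROLE %s TO GROUP %s;\n' "$1" "$3"
}
```

For example, db_role_sql test_role filtered test prints the last three statements of the listing above; the output can be pasted into a beeline session opened as an administrator.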

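The system has no test user or group yet, so create them manually (on distributions that use user private groups, such as RHEL/CentOS, useradd also creates a group with the same name):

  $ useradd test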
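
Because the roles are granted to operating system groups, it is worth confirming that a user really is a member of the expected group. A minimal sketch, assuming group membership is resolved from the local OS on the HiveServer2 host; user_in_group is a hypothetical helper name built on the standard id command.

```shell
#!/bin/sh
# user_in_group USER GROUP: succeed if `id -Gn USER` lists GROUP.
# Hypothetical helper; it reflects local OS groups only and assumes
# group names contain no spaces.
user_in_group() {
  id -Gn "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}
```

For example, user_in_group test test && echo "test is in group test" should print the message once the user has been created.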

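5. Testing

Testing the admin_role role

Connect with beeline as the hive user (the sessions below use port 10000; substitute the JDBC port of your HiveServer2):

  $ beeline -u "jdbc:hive2://cdh1:10000/" -n hive -p hive -d org.apache.hive.jdbc.HiveDriver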

Check the current system user:

  0: jdbc:hive2://cdh1:10000/> set system:user.name;
  +------------------------+--+
  |          set           |
  +------------------------+--+
  | system:user.name=hive  |
  +------------------------+--+
  1 row selected (0.188 seconds)

The hive user belongs to the hive group, which was granted admin_role, so it has administrator privileges and can list all roles:

  0: jdbc:hive2://cdh1:10000/> show roles;
  +-------------+--+
  |    role     |
  +-------------+--+
  | test_role   |
  | admin_role  |
  +-------------+--+
  2 rows selected (0.199 seconds)

View the privileges granted to each role:

  0: jdbc:hive2://cdh1:10000/> SHOW GRANT ROLE test_role;
  +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+------------------+
  | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  | grant_time       |
  +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+------------------+
  | filtered  |        |            |         | test_role       | ROLE            | *          | false         | 1430293474047000 |
  +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+------------------+
  0: jdbc:hive2://cdh1:10000/> SHOW GRANT ROLE admin_role;
  +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+------------------+
  | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  | grant_time       |
  +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+------------------+
  | *         |        |            |         | admin_role      | ROLE            | *          | false         | 1430293473308000 |
  +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+------------------+
  1 row selected (0.16 seconds)

The hive user can list every database and query every table:

  0: jdbc:hive2://cdh1:10000/> show databases;
  +----------------+--+
  | database_name  |
  +----------------+--+
  | default        |
  | filtered       |
  | sensitive      |
  +----------------+--+
  3 rows selected (0.391 seconds)
  0: jdbc:hive2://cdh1:10000/> use filtered;
  No rows affected (0.101 seconds)
  0: jdbc:hive2://cdh1:10000/> select * from filtered.events;
  +-----------------+----------------+----------------+--+
  | events.country  | events.client  | events.action  |
  +-----------------+----------------+----------------+--+
  | US              | android        | createNote     |
  | FR              | windows        | updateNote     |
  | US              | android        | updateNote     |
  | FR              | ios            | createNote     |
  | US              | windows        | updateTag      |
  +-----------------+----------------+----------------+--+
  5 rows selected (0.431 seconds)
  0: jdbc:hive2://cdh1:10000/> select * from sensitive.events;
  +---------------+-----------------+----------------+----------------+--+
  | events.ip     | events.country  | events.client  | events.action  |
  +---------------+-----------------+----------------+----------------+--+
  | 10.1.2.3      | US              | android        | createNote     |
  | 10.200.88.99  | FR              | windows        | updateNote     |
  | 10.1.2.3      | US              | android        | updateNote     |
  | 10.200.88.77  | FR              | ios            | createNote     |
  | 10.1.4.5      | US              | windows        | updateTag      |
  +---------------+-----------------+----------------+----------------+--+
  5 rows selected (0.247 seconds)

Testing the test_role role

Connect with beeline as the test user:

  $ beeline -u "jdbc:hive2://cdh1:10000/" -n test -p test -d org.apache.hive.jdbc.HiveDriver

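Check the current system user. The value is a Java system property of the HiveServer2 process, which is most likely why it still shows hive rather than test:

  0: jdbc:hive2://cdh1:10000/> set system:user.name;
  +------------------------+--+
  |          set           |
  +------------------------+--+
  | system:user.name=hive  |
  +------------------------+--+
  1 row selected (0.188 seconds)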

The test user is not an administrator and therefore cannot list all roles:

  0: jdbc:hive2://cdh1:10000/> show roles;
  ERROR : Error processing Sentry command: Access denied to test. Server Stacktrace: org.apache.sentry.provider.db.SentryAccessDeniedException: Access denied to test

The test user can list all databases:

  0: jdbc:hive2://cdh1:10000/> show databases;
  +----------------+--+
  | database_name  |
  +----------------+--+
  | default        |
  | filtered       |
  | sensitive      |
  +----------------+--+
  3 rows selected (0.079 seconds)

The test user can access the filtered database:

  0: jdbc:hive2://cdh1:10000/> use filtered;
  No rows affected (0.206 seconds)
  0: jdbc:hive2://cdh1:10000/> select * from events;
  +-----------------+----------------+----------------+--+
  | events.country  | events.client  | events.action  |
  +-----------------+----------------+----------------+--+
  | US              | android        | createNote     |
  | FR              | windows        | updateNote     |
  | US              | android        | updateNote     |
  | FR              | ios            | createNote     |
  | US              | windows        | updateTag      |
  +-----------------+----------------+----------------+--+
  5 rows selected (0.361 seconds)

However, the test user has no permission to access the sensitive database:

  0: jdbc:hive2://cdh1:10000/> use sensitive;
  Error: Error while compiling statement: FAILED: SemanticException No valid privileges
  Required privileges for this query: Server=server1->Db=sensitive->Table=*->action=insert;Server=server1->Db=sensitive->Table=*->action=select; (state=42000,code=40000)

6. Troubleshooting

In later CDH 5 releases the Hive CLI is no longer recommended. After Hive has been integrated with Sentry, running the Hive CLI fails with an exception saying that Sentry classes cannot be found. The fix is to symlink the Sentry jars into the lib directory under the Hive home directory:

  ln -s /usr/lib/sentry/lib/sentry-binding-hive.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-core-common.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-core-common-db.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-policy-common.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-policy-db.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-policy-cache.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-provider-common.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-provider-db.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-provider-cache.jar /usr/lib/hive/lib/
  ln -s /usr/lib/sentry/lib/sentry-provider-file.jar /usr/lib/hive/lib/
7. References