Installing Cloudera on Ubuntu 14.04

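  • Check the release with: lsb_release -a
  • Cloudera currently supports Ubuntu 14.04 poorly, and installing on it means resolving complicated dependency problems.
  • This article uses Ubuntu 12.04 as the example to get Cloudera running quickly.
  • The end of the article adds some ways to resolve dependencies when installing on Ubuntu 14.04.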

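I. Preparation (all machines)

1. Set up passwordless sudo

  • Open /etc/sudoers for editing:
    vim /etc/sudoers
  • Comment out this line:
    %sudo ALL=(ALL:ALL) ALL
  • Replace it with the following line, and move that line to the end of the file:
    %sudo ALL=NOPASSWD: ALL
  • Then add your user to the sudo group:
    sudo adduser <your-username> sudo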
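
As an alternative to editing /etc/sudoers by hand, the same rule can be placed in a drop-in file and checked before it takes effect. This is only a sketch: it assumes the stock "#includedir /etc/sudoers.d" line is present in /etc/sudoers, and both the function name write_nopasswd_rule and the file name 90-nopasswd are hypothetical.

```shell
#!/usr/bin/env bash
# write_nopasswd_rule TARGET: write a passwordless rule for the sudo group.
# TARGET is a parameter so the function can be tried on a scratch file first.
write_nopasswd_rule() {
    local target="$1"
    # %% prints a literal % in the printf format string.
    printf '%%sudo ALL=(ALL:ALL) NOPASSWD: ALL\n' > "$target"
    # sudo expects drop-in files to be read-only.
    chmod 0440 "$target"
}

# Usage (as root); visudo -cf only checks the syntax of the given file:
#   write_nopasswd_rule /etc/sudoers.d/90-nopasswd
#   visudo -cf /etc/sudoers.d/90-nopasswd
```

Files in /etc/sudoers.d are read after the main rules, which has the same effect as placing the line at the end of /etc/sudoers.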

2. Make sure the apt sources are correct; add the following if they are missing

  • vim /etc/apt/sources.list
  deb http://cn.archive.ubuntu.com/ubuntu/ precise main restricted
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise main restricted
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-updates main restricted
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-updates main restricted
  deb http://cn.archive.ubuntu.com/ubuntu/ precise universe
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise universe
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-updates universe
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-updates universe
  deb http://cn.archive.ubuntu.com/ubuntu/ precise multiverse
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise multiverse
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-updates multiverse
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-updates multiverse
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-backports main restricted universe multiverse
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-backports main restricted universe multiverse
  deb http://security.ubuntu.com/ubuntu precise-security main restricted
  deb-src http://security.ubuntu.com/ubuntu precise-security main restricted
  deb http://security.ubuntu.com/ubuntu precise-security universe
  deb-src http://security.ubuntu.com/ubuntu precise-security universe
  deb http://security.ubuntu.com/ubuntu precise-security multiverse
  deb-src http://security.ubuntu.com/ubuntu precise-security multiverse

3. Install curl

  sudo apt-get update    # optional
  sudo apt-get install curl -y

II. Install the server

1. Add entries to /etc/hosts (adjust to your environment)

  192.168.33.100 CDH
  192.168.33.101 CDH1
  192.168.33.102 CDH2
  192.168.33.103 CDH3
  192.168.33.104 CDH4
  192.168.33.105 CDH5
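
If the entries are added by script rather than by hand, it helps to make the step safe to repeat. The following is a minimal sketch of one way to do that; the helper name add_host_entry is hypothetical, and the addresses are the example values from the list above.

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME: append "IP NAME" unless NAME is already present.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # -w matches NAME as a whole word, so CDH does not match CDH1.
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage (as root), mirroring the entries above:
#   add_host_entry /etc/hosts 192.168.33.100 CDH
#   for i in 1 2 3 4 5; do
#       add_host_entry /etc/hosts "192.168.33.10$i" "CDH$i"
#   done
```

Running the loop a second time leaves the file unchanged, so the same script can be reused on every node.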

2. Edit /etc/hostname

  • Set it to CDH

3. Add the Cloudera source (we are using Ubuntu 14.04)

  • Download the list file:
    cd /etc/apt/sources.list.d/
    wget http://archive-primary.cloudera.com/cm5/ubuntu/trusty/amd64/cm/cloudera.list
  • Or create it by hand with vim /etc/apt/sources.list.d/cloudera.list, with this content:
    # Packages for Cloudera Manager, Version 5, on Ubuntu 14.04 x86_64
    deb [arch=amd64] http://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm trusty-cm5 contrib
    deb-src http://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm trusty-cm5 contrib

4. Fetch the apt key

  curl -s http://archive-primary.cloudera.com/cm5/ubuntu/trusty/amd64/cm/archive.key | sudo apt-key add -
  apt-get update

5. Install the Java environment

  • Install the JDK:
    apt-get -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold -y install oracle-j2sdk1.7
  • Set the environment variables by adding the following to /etc/profile:
    export JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
  • Run source /etc/profile
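
The same variables can also be kept in a separate file instead of being appended to /etc/profile, which makes the step easy to repeat on every node. This is a sketch under assumptions: the function name write_java_profile and the file name /etc/profile.d/java.sh are hypothetical, and it relies on Ubuntu's /etc/profile sourcing the *.sh files in /etc/profile.d.

```shell
#!/usr/bin/env bash
# write_java_profile TARGET [JAVA_HOME]: write the export lines to TARGET.
# JAVA_HOME defaults to the path used in this article.
write_java_profile() {
    local target="$1" java_home="${2:-/usr/lib/jvm/java-7-oracle-cloudera}"
    # Escaped \$ keeps the variable references literal in the generated file.
    cat > "$target" <<EOF
export JAVA_HOME=$java_home
export JRE_HOME=\${JAVA_HOME}/jre
export CLASSPATH=.:\${JAVA_HOME}/lib:\${JRE_HOME}/lib
export PATH=\${JAVA_HOME}/bin:\$PATH
EOF
}

# Usage (as root):
#   write_java_profile /etc/profile.d/java.sh
#   . /etc/profile.d/java.sh
```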

6. Install MySQL and the JDBC driver

  sudo apt-get install mysql-server libmysql-java -y

7. Configure MySQL

  • /etc/mysql/conf.d/mysql_cloudera_manager.cnf
  [mysqld]
  transaction-isolation=READ-COMMITTED
  # Disabling symbolic-links is recommended to prevent assorted security risks;
  # to do so, uncomment this line:
  # symbolic-links=0
  key_buffer = 16M
  key_buffer_size = 32M
  max_allowed_packet = 16M
  thread_stack = 256K
  thread_cache_size = 64
  query_cache_limit = 8M
  query_cache_size = 64M
  query_cache_type = 1
  # Important: see Configuring the Databases and Setting max_connections
  max_connections = 550
  # log-bin should be on a disk with enough free space
  log-bin=/var/log/mysql/mysql_binary_log
  # For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
  binlog_format = mixed
  read_buffer_size = 2M
  read_rnd_buffer_size = 16M
  sort_buffer_size = 8M
  join_buffer_size = 8M
  # InnoDB settings
  innodb_file_per_table = 1
  innodb_flush_log_at_trx_commit = 2
  innodb_log_buffer_size = 64M
  innodb_buffer_pool_size = 4G
  innodb_thread_concurrency = 8
  innodb_flush_method = O_DIRECT
  innodb_log_file_size = 512M
  [mysqld_safe]
  log-error=/var/log/mysqld.log
  pid-file=/var/run/mysqld/mysqld.pid
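  • Edit my.cnf:
    vim /etc/mysql/my.cnf
  • Comment out the following line:
    #bind-address = 127.0.0.1
  • Notes:
    Make sure the machine has enough memory during installation; otherwise you will run into the problem below.
    The settings above must be adjusted to your own environment. While restarting mysql during configuration, the following errors occurred:
    stop: Unknown instance:
    start: Job failed to start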
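
A quick check before restarting mysql can show whether the machine has less RAM than the configured InnoDB buffer pool. That the 4G innodb_buffer_pool_size is what triggers the startup failure on small machines is an assumption here, not something the error output confirms; the function names mem_total_mb and check_pool_fits are hypothetical.

```shell
#!/usr/bin/env bash
# mem_total_mb [MEMINFO]: print total RAM in MiB (reads /proc/meminfo by default).
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# check_pool_fits POOL_MB [MEMINFO]: succeed only if POOL_MB is below total RAM.
check_pool_fits() {
    local pool_mb="$1" total
    total="$(mem_total_mb "${2:-/proc/meminfo}")"
    [ "$pool_mb" -lt "$total" ]
}

# Usage, with 4096 MiB standing for innodb_buffer_pool_size = 4G:
#   check_pool_fits 4096 || echo "lower innodb_buffer_pool_size first" >&2
```

The check is deliberately coarse: it compares against total memory only and ignores the other buffers in the configuration.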

8. Configure InnoDB

  mv /var/lib/mysql/ib_logfile* /var/tmp/

9. Initialize the databases

  • service mysql restart
  • mysql -uroot -p
  • Run the following SQL:
    create database amon DEFAULT CHARACTER SET utf8;
    grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password';
    grant all on amon.* TO 'amon'@'CDH' IDENTIFIED BY 'amon_password';
    create database smon DEFAULT CHARACTER SET utf8;
    grant all on smon.* TO 'smon'@'%' IDENTIFIED BY 'smon_password';
    grant all on smon.* TO 'smon'@'CDH' IDENTIFIED BY 'smon_password';
    create database rman DEFAULT CHARACTER SET utf8;
    grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman_password';
    grant all on rman.* TO 'rman'@'CDH' IDENTIFIED BY 'rman_password';
    create database hmon DEFAULT CHARACTER SET utf8;
    grant all on hmon.* TO 'hmon'@'%' IDENTIFIED BY 'hmon_password';
    grant all on hmon.* TO 'hmon'@'CDH' IDENTIFIED BY 'hmon_password';
    create database hive DEFAULT CHARACTER SET utf8;
    grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
    grant all on hive.* TO 'hive'@'CDH' IDENTIFIED BY 'hive_password';
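
The fifteen statements follow one pattern per database, so they can be generated instead of typed. The sketch below reproduces the same statements for the five databases used here; the function name gen_cm_sql is hypothetical, and the "<name>_password" convention is simply the placeholder scheme from the SQL above.

```shell
#!/usr/bin/env bash
# gen_cm_sql HOST DB...: print the CREATE DATABASE and GRANT statements
# for each DB, granting to 'DB'@'%' and 'DB'@'HOST'.
gen_cm_sql() {
    local host="$1" db
    shift
    for db in "$@"; do
        printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$db"
        # %% prints the literal % wildcard host.
        printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s_password';\n" "$db" "$db" "$db"
        printf "grant all on %s.* TO '%s'@'%s' IDENTIFIED BY '%s_password';\n" "$db" "$db" "$host" "$db"
    done
}

# Usage:
#   gen_cm_sql CDH amon smon rman hmon hive | mysql -uroot -p
```

Replace the generated passwords with real ones before using this on anything other than a test cluster.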

10. Install cloudera-manager and the agent (the master is also a node)

  apt-get install cloudera-manager-daemons cloudera-manager-server cloudera-manager-agent -y

11. Configure the cloudera-manager-server database

  sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql -uroot -p --scm-host localhost scm scm scm_password

12. Edit the agent configuration file

  • vim /etc/cloudera-scm-agent/config.ini
    • Set server_host=CDH

13. Lower the swap tendency (vm.swappiness)

  echo 'vm.swappiness=0' >> /etc/sysctl.conf
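
Appending with >> adds a duplicate line every time the step is repeated, and the value only takes effect after the file is reloaded or the machine reboots. A minimal sketch of a repeatable variant follows; the helper name set_sysctl is hypothetical.

```shell
#!/usr/bin/env bash
# set_sysctl FILE KEY VALUE: replace an existing "KEY=..." line or append one.
set_sysctl() {
    local file="$1" key="$2" value="$3"
    touch "$file"
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$file"; then
        # -i.bak edits in place and keeps a backup copy next to the file.
        sed -i.bak "s|^[[:space:]]*$key[[:space:]]*=.*|$key=$value|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root); sysctl -p reloads /etc/sysctl.conf immediately:
#   set_sysctl /etc/sysctl.conf vm.swappiness 0
#   sysctl -p
```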

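III. Configure the remaining nodes (cdh1, cdh2 ...)

1. Update /etc/apt/sources.list (if necessary)

  • Copy the file from CDH to the other nodes (cdh1, cdh2, ...)
  • For example: scp vagrant@192.168.33.100:/etc/apt/sources.list /etc/apt/

2. Add entries to /etc/hosts (adjust to your environment)

  • Copy the hosts file from CDH to the other nodes
  • For example: scp vagrant@192.168.33.100:/etc/hosts /etc/

3. Edit /etc/hostname

  • Set it to the matching name: CDH1, CDH2, ...
    echo 'CDH1' > /etc/hostname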

4. Add the Cloudera source

  • Copy it directly:
    scp vagrant@192.168.33.100:/etc/apt/sources.list.d/cloudera.list /etc/apt/sources.list.d/
  • Or download it:
    wget http://archive-primary.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh/cloudera.list
  • Or create it by hand with vim /etc/apt/sources.list.d/cloudera.list, with this content:
    # Packages for Cloudera's Distribution for Hadoop, Version 5, on Ubuntu 14.04 amd64
    deb [arch=amd64] http://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5 contrib
    deb-src http://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh trusty-cdh5 contrib

5. Fetch the apt key

6. Install the Java environment

  • Install the JDK:
    apt-get -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold -y install oracle-j2sdk1.7
  • Set the environment variables by adding the following to /etc/profile:
    export JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
  • Run source /etc/profile

7. Install cloudera-manager-agent and cloudera-manager-daemons

  • sudo apt-get install cloudera-manager-agent cloudera-manager-daemons -y

8. Edit the agent configuration file

  • vim /etc/cloudera-scm-agent/config.ini
    • Set server_host=CDH

9. Lower the swap tendency (vm.swappiness)

  echo 'vm.swappiness=0' >> /etc/sysctl.conf

IV. Start the services

1. Start the agent on each node

  • sudo service cloudera-scm-agent restart

2. Restart cloudera-manager and the agent on the control node

  • sudo service cloudera-scm-server restart
  • sudo service cloudera-scm-agent restart

V. Cloning child nodes

1. Set up one child node first

2. Clone the node

  vboxmanage clonevm CDH1 --name CDH2 --register

3. Adjust the cloned node

  • Change the IP in /etc/network/interfaces
  • Change the hostname: echo 'CDH?' > /etc/hostname
  • Change the host_id: echo 'CMF_AGENT_ARGS="--host_id CDH?"' > /etc/default/cloudera-scm-agent
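
The hostname and host_id changes can be wrapped in one function so that each clone gets both files written consistently. This is a sketch only: the function name prepare_clone and its ROOT argument are hypothetical (ROOT exists so the function can be tried against a scratch directory), and the IP change in /etc/network/interfaces is left as a manual step.

```shell
#!/usr/bin/env bash
# prepare_clone ROOT NAME: write NAME into ROOT/etc/hostname and set the
# agent host_id in ROOT/etc/default/cloudera-scm-agent.
prepare_clone() {
    local root="$1" name="$2"
    mkdir -p "$root/etc/default"
    printf '%s\n' "$name" > "$root/etc/hostname"
    printf 'CMF_AGENT_ARGS="--host_id %s"\n' "$name" > "$root/etc/default/cloudera-scm-agent"
}

# Usage on the cloned machine (as root); an empty ROOT targets the real /etc:
#   prepare_clone "" CDH2
```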

VI. Differences when using 14.04

1. Add the Cloudera source

  # Packages for Cloudera Manager, Version 5, on Ubuntu 14.04 x86_64
  deb [arch=amd64] http://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm trusty-cm5 contrib
  deb-src http://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm trusty-cm5 contrib

2. Fetch the apt key

  curl -s http://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm/archive.key | sudo apt-key add -
  apt-get update

3. Sources for 12.04

  deb http://cn.archive.ubuntu.com/ubuntu/ precise main restricted
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise main restricted
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-updates main restricted
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-updates main restricted
  deb http://cn.archive.ubuntu.com/ubuntu/ precise universe
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise universe
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-updates universe
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-updates universe
  deb http://cn.archive.ubuntu.com/ubuntu/ precise multiverse
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise multiverse
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-updates multiverse
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-updates multiverse
  deb http://cn.archive.ubuntu.com/ubuntu/ precise-backports main restricted universe multiverse
  deb-src http://cn.archive.ubuntu.com/ubuntu/ precise-backports main restricted universe multiverse
  deb http://security.ubuntu.com/ubuntu precise-security main restricted
  deb-src http://security.ubuntu.com/ubuntu precise-security main restricted
  deb http://security.ubuntu.com/ubuntu precise-security universe
  deb-src http://security.ubuntu.com/ubuntu precise-security universe
  deb http://security.ubuntu.com/ubuntu precise-security multiverse
  deb-src http://security.ubuntu.com/ubuntu precise-security multiverse

4. If Python fails with: ImportError: No module named _io

  mv /usr/lib/cmf/agent/build/env/bin/python /usr/lib/cmf/agent/build/env/bin/python.bak
  cp /usr/bin/python2.7 /usr/lib/cmf/agent/build/env/bin/python
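
Running the two commands above a second time would overwrite python.bak with the already replaced interpreter. A guarded variant might look like the sketch below; the function name swap_agent_python is hypothetical, and the paths in the usage line are the ones from this step.

```shell
#!/usr/bin/env bash
# swap_agent_python BIN_DIR SYSTEM_PY: back up BIN_DIR/python once, then
# copy SYSTEM_PY over it. Safe to run more than once.
swap_agent_python() {
    local bin_dir="$1" system_py="$2"
    [ -x "$system_py" ] || { echo "missing $system_py" >&2; return 1; }
    # Keep the first backup; never replace it with an already swapped copy.
    if [ ! -e "$bin_dir/python.bak" ]; then
        mv "$bin_dir/python" "$bin_dir/python.bak" || return 1
    fi
    cp "$system_py" "$bin_dir/python"
}

# Usage (as root):
#   swap_agent_python /usr/lib/cmf/agent/build/env/bin /usr/bin/python2.7
```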