1. Stress-test commands

  1. Test write performance
  2. hadoop jar /export/server/hadoop2.7/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.5-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 128MB
  3. Test read performance
  4. hadoop jar /export/server/hadoop2.7/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.5-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 128MB
  5. Delete the data generated by the tests
  6. hadoop jar /export/server/hadoop2.7/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.5-tests.jar TestDFSIO -clean
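
The three commands above differ only in the mode flag and the size parameters, so they can be driven from a small script. The following is a minimal sketch, not part of the original notes: the function names `build_testdfsio_cmd` and `run_benchmark` are hypothetical, and it assumes `hadoop` is on the PATH and the jar sits at the path used above. It uses only the flags shown in the commands above.

```python
import subprocess

# Jar path copied from the commands above; adjust to your installation.
JAR = ("/export/server/hadoop2.7/share/hadoop/mapreduce/"
       "hadoop-mapreduce-client-jobclient-2.7.5-tests.jar")


def build_testdfsio_cmd(mode, nr_files=10, file_size="128MB"):
    """Return the argument list for one TestDFSIO run (hypothetical helper)."""
    if mode not in ("write", "read", "clean"):
        raise ValueError("mode must be 'write', 'read' or 'clean'")
    cmd = ["hadoop", "jar", JAR, "TestDFSIO", "-" + mode]
    # -clean takes no size parameters; write and read use the same ones.
    if mode != "clean":
        cmd += ["-nrFiles", str(nr_files), "-fileSize", file_size]
    return cmd


def run_benchmark(nr_files=10, file_size="128MB"):
    """Run write, then read, then clean; stop at the first failing step."""
    for mode in ("write", "read", "clean"):
        subprocess.run(build_testdfsio_cmd(mode, nr_files, file_size),
                       check=True)
```

Calling `run_benchmark()` on a cluster node would execute the three steps in order with the same values as above. The read test has to follow a write test with the same parameters, since it reads the files the write test created.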

2. Hadoop parameter tuning

  1. HDFS parameter tuning (hdfs-site.xml)
  2. dfs.namenode.handler.count = 20 * log2(cluster size). For example, for a cluster of 8 nodes, set this parameter to 60.
  3. The number of Namenode RPC server threads that listen to requests from clients. If dfs.namenode.servicerpc-address is not configured then Namenode RPC server threads listen to requests from all nodes.
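  4. The NameNode has a pool of worker threads that handles concurrent heartbeats from the DataNodes and concurrent metadata operations from clients. For large clusters, or clusters with many clients, the default dfs.namenode.handler.count of 10 usually needs to be raised. The general rule of thumb is 20 times the logarithm of the cluster size, i.e. 20 * log N, where N is the cluster size. This rule is often stated with the natural logarithm (20 * ln 8 is roughly 41.6), whereas the formula in item 2 uses base 2 and gives the larger value of 60 for the same 8-node cluster.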
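  5. YARN parameter tuning (yarn-site.xml)
  6. yarn.nodemanager.resource.memory-mb: the total amount of physical memory that YARN may use on this node (default 8192 MB).
  7. yarn.scheduler.maximum-allocation-mb: the maximum amount of physical memory a single task (container request) may ask for (default 8192 MB).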
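
The handler-count rule and the two YARN memory settings above reduce to simple arithmetic, sketched below. The function names are hypothetical and not part of Hadoop. Two things here are assumptions rather than statements from the notes: the result is truncated to an integer, and it is never allowed to fall below the default of 10. The second function only divides the node memory by the container size and ignores CPU and other limits.

```python
import math


def namenode_handler_count(cluster_size, natural_log=False):
    """Suggest a dfs.namenode.handler.count value: 20 * log(cluster size).

    Base 2 matches the worked example (8 nodes -> 60); natural_log=True
    gives the smaller value from the natural-logarithm form of the rule.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    log_value = math.log(cluster_size) if natural_log else math.log2(cluster_size)
    # Assumption: truncate, and never go below the default of 10.
    return max(10, int(20 * log_value))


def max_size_containers_per_node(node_memory_mb=8192, max_allocation_mb=8192):
    """How many containers of the maximum size fit into one node's YARN memory."""
    return node_memory_mb // max_allocation_mb


print(namenode_handler_count(8))                    # base-2 rule
print(namenode_handler_count(8, natural_log=True))  # natural-log rule
print(max_size_containers_per_node())               # with both defaults
```

With both YARN defaults at 8192 MB, one maximum-size container fills the whole node, which is one reason the two values are usually tuned together.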