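YARN Core Parameter Configuration Example

On a learning VM, take a snapshot before changing core parameters so that you can roll back to the original state for other configurations later.

Example: count the occurrences of each word in 1 GB of data.

Servers: 3 nodes, each with 4 GB of memory and a 4-core, 4-thread CPU.

Analysis:

1 GB / 128 MB (split size) = 8 MapTasks;

The word counts can be written to a single result file, so 1 ReduceTask;

Counting words needs only one MapReduce job, so 1 MRAppMaster;

In total, 8 + 1 + 1 = 10 Containers are needed.

On average, each NodeManager runs 10 / 3 ≈ 3 containers (split 4 / 3 / 3 across the three nodes).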
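
The estimate above can be reproduced with a short calculation. This is only a sketch: the function names are made up for illustration, and the even split is an assumption, since real container placement depends on the scheduler and on data locality.

```python
import math


def estimate_containers(data_mb, split_mb, reduce_tasks=1, app_masters=1):
    """Return (map_tasks, total_containers) for a single MapReduce job."""
    map_tasks = math.ceil(data_mb / split_mb)  # one MapTask per input split
    return map_tasks, map_tasks + reduce_tasks + app_masters


def spread_evenly(total, nodes):
    """Split `total` containers across `nodes` as evenly as possible."""
    base, extra = divmod(total, nodes)
    return [base + 1 if i < extra else base for i in range(nodes)]


map_tasks, containers = estimate_containers(data_mb=1024, split_mb=128)
per_node = spread_evenly(containers, nodes=3)
print(map_tasks, containers, per_node)  # 8 10 [4, 3, 3]
```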

Edit yarn-site.xml:

<property>
  <!-- Select the scheduler, for example the Capacity Scheduler -->
  <description>The class to use as the resource scheduler.</description>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
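  <!-- Number of threads the ResourceManager uses to handle scheduler requests; default 50 -->
  <!-- Raise it if more than 50 jobs are submitted, but do not exceed 3 nodes * 4 threads = 12 threads (after leaving room for other applications, realistically no more than 8) -->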
  <description>Number of threads to handle scheduler interface.</description>
  <name>yarn.resourcemanager.scheduler.client.thread-count</name>
  <value>8</value>
</property>
<property>
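  <!-- Whether YARN auto-detects the hardware and configures itself; default false -->
  <!-- If the node runs many other applications, manual configuration is recommended -->
  <!-- If the node runs no other applications, auto-detection can be used -->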
  <description>Enable auto-detection of node capabilities such as memory and CPU.</description>
  <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
  <value>false</value>
</property>
<property>
  <!-- Whether to count logical processors (hardware threads) as CPU cores; default false, i.e. physical cores are used -->
  <description>Flag to determine if logical processors(such as
    hyperthreads) should be counted as cores. Only applicable on Linux
    when yarn.nodemanager.resource.cpu-vcores is set to -1 and
    yarn.nodemanager.resource.detect-hardware-capabilities is true.
  </description>
  <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
  <value>false</value>
</property>
<property>
  <!-- Multiplier for converting physical cores to vcores; default 1.0 -->
  <!-- Our servers have 4 cores and 4 threads, so the thread-to-core ratio is 1.0 -->
  <description>Multiplier to determine how to convert physical cores to
    vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
    is set to -1(which implies auto-calculate vcores) and
    yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The
    number of vcores will be calculated as
    number of CPUs * multiplier.
  </description>
  <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
  <value>1.0</value>
</property>
<property>
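  <!-- Memory available to the NodeManager; default -1, which means 8 GB when hardware detection is off and an auto-calculated value when it is on -->
  <!-- Our servers have 4 GB, so set this to 4 GB -->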
  <description>Amount of physical memory, in MB, that can be allocated 
    for containers. If set to -1 and
    yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
    automatically calculated(in case of Windows and Linux).
    In other cases, the default is 8192MB.
  </description>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <!-- Number of CPU vcores for the NodeManager; default -1, which means 8 when hardware detection is off and an auto-calculated value when it is on -->
  <!-- Our servers have only 4 cores and 4 threads -->
  <description>Number of vcores that can be allocated
    for containers. This is used by the RM scheduler when allocating
    resources for containers. This is not used to limit the number of
    CPUs used by YARN containers. If it is set to -1 and
    yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
    automatically determined from the hardware in case of Windows and Linux.
    In other cases, number of vcores is 8 by default.</description>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
</property>
<property>
  <!-- Minimum container memory; default 1 GB -->
  <description>The minimum allocation for every container request at the RM
    in MBs. Memory requests lower than this will be set to the value of this
    property. Additionally, a node manager that is configured to have less memory
    than this value will be shut down by the resource manager.</description>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <!-- Maximum container memory; default 8 GB -->
  <!-- Our servers have only 4 GB of memory and, per the analysis above, each must run 3 containers, so the container maximum can be set to 2 GB -->
  <description>The maximum allocation for every container request at the RM
    in MBs. Memory requests higher than this will throw an
    InvalidResourceRequestException.</description>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
<property>
  <!-- Minimum number of CPU vcores per container; default 1 -->
  <description>The minimum allocation for every container request at the RM
    in terms of virtual CPU cores. Requests lower than this will be set to the
    value of this property. Additionally, a node manager that is configured to
    have fewer virtual cores than this value will be shut down by the resource
    manager.</description>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <!-- Maximum number of CPU vcores per container; default 4 -->
  <!-- Our servers have 4 cores and, per the analysis above, each must run 3 containers, so the container maximum is set to 2 vcores -->
  <description>The maximum allocation for every container request at the RM
    in terms of virtual CPU cores. Requests higher than this will throw an
    InvalidResourceRequestException.</description>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>2</value>
</property>
<property>
  <!-- Virtual memory check; enabled by default -->
  <!-- With CentOS 7 + JDK 8, disabling this check is recommended -->
  <description>Whether virtual memory limits will be enforced for
    containers.</description>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <!-- Ratio of virtual memory to physical memory (the limit used by the virtual memory check); default 2.1 -->
  <description>Ratio between virtual memory to physical memory when
    setting memory limits for containers. Container allocations are
    expressed in terms of physical memory, and virtual memory usage
    is allowed to exceed this allocation by this ratio.
  </description>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>
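
As a quick sanity check on the values chosen above, the sketch below verifies that the minimum and maximum allocations fit inside one node and counts how many containers of a given size a node can hold. The helper names are hypothetical, and it assumes both memory and vcores constrain placement, which depends on the resource calculator in use.

```python
# Values taken from the yarn-site.xml settings above.
NODE_MEMORY_MB, NODE_VCORES = 4096, 4
MIN_ALLOC_MB, MAX_ALLOC_MB = 1024, 2048
MIN_ALLOC_VCORES, MAX_ALLOC_VCORES = 1, 2


def bounds_ok(node, minimum, maximum):
    """A request range is usable only if 0 < min <= max <= node capacity."""
    return 0 < minimum <= maximum <= node


def containers_per_node(node_mb, node_vcores, container_mb, container_vcores):
    """How many identical containers fit on one node (memory and vcores)."""
    return min(node_mb // container_mb, node_vcores // container_vcores)


memory_ok = bounds_ok(NODE_MEMORY_MB, MIN_ALLOC_MB, MAX_ALLOC_MB)
vcores_ok = bounds_ok(NODE_VCORES, MIN_ALLOC_VCORES, MAX_ALLOC_VCORES)
smallest = containers_per_node(NODE_MEMORY_MB, NODE_VCORES, MIN_ALLOC_MB, MIN_ALLOC_VCORES)
largest = containers_per_node(NODE_MEMORY_MB, NODE_VCORES, MAX_ALLOC_MB, MAX_ALLOC_VCORES)
print(memory_ok, vcores_ok, smallest, largest)  # True True 4 2
```

With minimum-size containers a node holds 4, which covers the 4 / 3 / 3 split from the analysis; with maximum-size containers it holds only 2.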

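Why the virtual memory check is disabled:

The virtual memory limit is derived from the ratio configured in yarn.nodemanager.vmem-pmem-ratio, whose default is 2.1. With 4 GB of physical memory, Hadoop therefore assumes a virtual memory limit of 8.4 GB: vmemLimit = pmem * vmem-pmem-ratio (pmem: physical memory).

On CentOS 7 with JDK 8, Linux reserves a large portion of virtual memory for the Java process, but JDK 8 does not use that reservation; it uses the remaining part instead (the Java heap portion in the original figure), so much of the reserved memory is wasted. Because the remaining part is small, under 4 GB, virtual memory usage easily exceeds the limit, and once the actual vmem usage exceeds vmemLimit or olderThanAge the process may be killed.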

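After editing yarn-site.xml, distribute it to every node in the cluster.

In a real production environment, the server nodes may each have different hardware, so each node needs its own configuration and the file cannot simply be copied to all of them.

Then restart YARN: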

sbin/stop-yarn.sh
sbin/start-yarn.sh

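View the resulting configuration in a browser: http://hadoop103:8088

Capacity Scheduler Multi-Queue Submission Example

The Capacity Scheduler is typically used at small and medium-sized companies.

Multi-queue configuration

Configuring multiple scheduler queues for production:

● By default the scheduler has a single queue, default, which does not meet production needs
● Queues can be split by framework, for example a hive queue, a spark queue, and a flink queue, with each framework's jobs placed in its own queue (companies rarely do this)
● Queues can be split by business module, for example a login queue, a shopping-cart queue, an order queue, a department 1 queue, and a department 2 queue

Advantages of multiple queues:

● Code with bugs such as runaway recursion or infinite loops cannot exhaust all cluster resources
● Jobs can be degraded gracefully: during critical periods, the queues for important jobs keep enough resources

Example: the default queue gets 40% of total memory, with a maximum capacity of 60% of total resources (when it runs short it can borrow from other queues, up to 60% in total); the hive queue gets 60% of total memory, with a maximum capacity of 80% of total resources (when it runs short it can borrow up to 80% in total). Job priorities are configured as well.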
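
To make the percentages concrete, the sketch below converts them into megabytes. It assumes the cluster from the previous example (3 nodes with 4096 MB each given to YARN); the dictionary layout and function name are illustrative only. It also checks that the sibling queue capacities add up to 100, which the Capacity Scheduler requires for percentage-based capacities.

```python
CLUSTER_MEMORY_MB = 3 * 4096  # assumption: 3 NodeManagers, 4096 MB each

QUEUES = {
    "default": {"capacity": 40, "maximum_capacity": 60},
    "hive": {"capacity": 60, "maximum_capacity": 80},
}


def queue_memory_mb(cluster_mb, queues):
    """Convert percentage capacities to MB (rounded down) per queue."""
    total = sum(q["capacity"] for q in queues.values())
    if total != 100:
        raise ValueError(f"queue capacities sum to {total}, expected 100")
    result = {}
    for name, q in queues.items():
        guaranteed = cluster_mb * q["capacity"] // 100
        ceiling = cluster_mb * q["maximum_capacity"] // 100
        result[name] = {
            "guaranteed_mb": guaranteed,
            "max_mb": ceiling,
            "borrowable_mb": ceiling - guaranteed,
        }
    return result


for name, sizes in queue_memory_mb(CLUSTER_MEMORY_MB, QUEUES).items():
    print(name, sizes)
```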

The multi-queue Capacity Scheduler is configured in $HADOOP_HOME/etc/hadoop/capacity-scheduler.xml.

Change the following settings in capacity-scheduler.xml:

<!-- Settings that come before yarn.scheduler.capacity.root.queues can keep their defaults -->
<property>
  <!-- Queues under the Capacity Scheduler root; the default value is default -->
  <!-- Set to default,hive to add a hive queue -->
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,hive</value>
  <description>
    The queues at this level (root is the root queue).
  </description>
</property>
<property>
  <!-- Memory capacity of the default queue under root; default 100% -->
  <!-- Per the requirement above, change it to 40% -->
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>40</value>
  <description>Default queue target capacity.</description>
</property>
<property>
  <!-- Memory capacity of the hive queue; the queue does not exist by default, so this property must be added -->
  <!-- Per the requirement above, set it to 60% -->
  <name>yarn.scheduler.capacity.root.hive.capacity</name>
  <value>60</value>
  <description>Hive queue target capacity.</description>
</property>
<property>
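  <!-- Largest share of the default queue's resources that a single user may occupy; default 1 (one user can use up all of the default queue's resources) -->
  <!-- Adjust as needed so that one user's infinite loop or similar mistake cannot exhaust the whole queue -->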
  <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
  <value>1</value>
  <description>
    Default queue user limit a percentage from 0.0 to 1.0.
  </description>
</property>
<property>
  <!-- Largest share of the hive queue's resources that a single user may occupy; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.user-limit-factor</name>
  <value>1</value>
  <description>
    Hive queue user limit a percentage from 0.0 to 1.0.
  </description>
</property>
<property>
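  <!-- Maximum resource capacity the default queue may occupy; default 100% -->
  <!-- Per the requirement above, change it to 60% (the default queue's capacity is 40%, so it can borrow up to another 20% from other queues) -->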
  <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
  <value>60</value>
  <description>
    The maximum capacity of the default queue. 
  </description>
</property>
<property>
  <!-- Maximum resource capacity the hive queue may occupy; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.maximum-capacity</name>
  <value>80</value>
  <description>
    The maximum capacity of the hive queue.
  </description>
</property>
<property>
  <!-- State of the default queue; the default value RUNNING needs no change -->
  <name>yarn.scheduler.capacity.root.default.state</name>
  <value>RUNNING</value>
  <description>
    The state of the default queue. State can be one of RUNNING or STOPPED.
  </description>
</property>
<property>
  <!-- State of the hive queue; not present by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.state</name>
  <value>RUNNING</value>
  <description>
    The state of the hive queue. State can be one of RUNNING or STOPPED.
  </description>
</property>
<property>
  <!-- ACL for submitting jobs to the default queue; default * (every user may submit to this queue), no change needed -->
  <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
  <value>*</value>
  <description>
    The ACL of who can submit jobs to the default queue.
  </description>
</property>
<property>
  <!-- ACL for submitting jobs to the hive queue; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.acl_submit_applications</name>
  <value>*</value>
  <description>
    The ACL of who can submit jobs to the hive queue.
  </description>
</property>
<property>
  <!-- ACL for administering the default queue; default * (every user may kill or otherwise manage jobs in the queue), no change needed -->
  <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
  <value>*</value>
  <description>
    The ACL of who can administer jobs on the default queue.
  </description>
</property>
<property>
  <!-- ACL for administering the hive queue; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.acl_administer_queue</name>
  <value>*</value>
  <description>
    The ACL of who can administer jobs on the hive queue.
  </description>
</property>
<property>
  <!-- ACL for setting the priority of jobs submitted to the default queue; default * (every user may set priorities in the queue), no change needed -->
  <name>yarn.scheduler.capacity.root.default.acl_application_max_priority</name>
  <value>*</value>
  <description>
    The ACL of who can submit applications with configured priority.
    For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
  </description>
</property>
<property>
  <!-- ACL for setting the priority of jobs submitted to the hive queue; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.acl_application_max_priority</name>
  <value>*</value>
  <description>
    The ACL of who can submit applications with configured priority.
    For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
  </description>
</property>
<property>
  <!-- Maximum lifetime that an application in the default queue may specify -->
  <!-- If an application specifies a lifetime, the value cannot exceed this maximum for applications submitted to this queue -->
  <!-- Setting an application's lifetime: yarn application -appId <app_id> -updateLifetime <Timeout> -->
  <!-- An application that runs longer than its lifetime is killed -->
  <name>yarn.scheduler.capacity.root.default.maximum-application-lifetime</name>
  <value>-1</value>
  <description>
    Maximum lifetime of an application which is submitted to a queue
    in seconds. Any value less than or equal to zero will be considered as
    disabled.
    This will be a hard time limit for all applications in this
    queue. If positive value is configured then any application submitted
    to this queue will be killed after it exceeds the configured lifetime.
    User can also specify lifetime per application basis in
    application submission context. But user lifetime will be
    overridden if it exceeds queue maximum lifetime. It is point-in-time
    configuration.
    Note : Configuring too low value will result in killing application
    sooner. This feature is applicable only for leaf queue.
  </description>
</property>
<property>
  <!-- The same setting for the hive queue; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.maximum-application-lifetime</name>
  <value>-1</value>
  <description>
    Maximum lifetime of an application which is submitted to a queue
    in seconds. Any value less than or equal to zero will be considered as
    disabled.
    This will be a hard time limit for all applications in this
    queue. If positive value is configured then any application submitted
    to this queue will be killed after it exceeds the configured lifetime.
    User can also specify lifetime per application basis in
    application submission context. But user lifetime will be
    overridden if it exceeds queue maximum lifetime. It is point-in-time
    configuration.
    Note : Configuring too low value will result in killing application
    sooner. This feature is applicable only for leaf queue.
  </description>
</property>
<property>
  <!-- default queue: if an application does not specify a lifetime, default-application-lifetime is used as the default -->
  <name>yarn.scheduler.capacity.root.default.default-application-lifetime</name>
  <value>-1</value>
  <description>
    Default lifetime of an application which is submitted to a queue
    in seconds. Any value less than or equal to zero will be considered as
    disabled.
    If the user has not submitted application with lifetime value then this
    value will be taken. It is point-in-time configuration.
    Note : Default lifetime can't exceed maximum lifetime. This feature is
    applicable only for leaf queue.
  </description>
</property>
<property>
  <!-- The same setting for the hive queue; the queue does not exist by default, so this property must be added -->
  <name>yarn.scheduler.capacity.root.hive.default-application-lifetime</name>
  <value>-1</value>
  <description>
    Default lifetime of an application which is submitted to a queue
    in seconds. Any value less than or equal to zero will be considered as
    disabled.
    If the user has not submitted application with lifetime value then this
    value will be taken. It is point-in-time configuration.
    Note : Default lifetime can't exceed maximum lifetime. This feature is
    applicable only for leaf queue.
  </description>
</property>
<!-- The remaining settings are unrelated to the Capacity Scheduler root queues and can keep their defaults -->
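
In a file this long, a duplicated property name or capacities that do not add up to 100 are easy to miss. The sketch below shows one way to check for both using only the Python standard library. It is a minimal illustration, not a Hadoop tool: SAMPLE is a shortened stand-in for the real file, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

PREFIX = "yarn.scheduler.capacity.root."

# Shortened stand-in for capacity-scheduler.xml (assumption for this sketch).
SAMPLE = """<configuration>
  <property><name>yarn.scheduler.capacity.root.queues</name><value>default,hive</value></property>
  <property><name>yarn.scheduler.capacity.root.default.capacity</name><value>40</value></property>
  <property><name>yarn.scheduler.capacity.root.hive.capacity</name><value>60</value></property>
</configuration>"""


def load_properties(xml_text):
    """Return (name -> value dict, sorted list of names that appear twice)."""
    root = ET.fromstring(xml_text)
    pairs = [
        (p.findtext("name", "").strip(), p.findtext("value", "").strip())
        for p in root.iter("property")
    ]
    counts = Counter(name for name, _ in pairs)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    return dict(pairs), duplicates


def check_root_queues(props):
    """Return (queues lacking a capacity, sum of the configured capacities)."""
    queues = [q.strip() for q in props[PREFIX + "queues"].split(",")]
    missing = [q for q in queues if PREFIX + q + ".capacity" not in props]
    total = sum(float(props[PREFIX + q + ".capacity"])
                for q in queues if q not in missing)
    return missing, total


props, duplicates = load_properties(SAMPLE)
missing, total = check_root_queues(props)
print(duplicates, missing, total)  # [] [] 100.0
```

To check a real file, read its contents into a string and pass that to load_properties instead of SAMPLE.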

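After editing the configuration file, distribute it to the other nodes in the cluster.

If the YARN cluster is already running, refresh the queue configuration with the command below; no restart is needed: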

# Refresh the YARN queue configuration
# yarn rmadmin does not reload yarn-site.xml; if yarn-site.xml was changed, the YARN cluster must be restarted
yarn rmadmin -refreshQueues

Run the wordcount program and specify the target queue:

# Submit to the hive queue
# -D overrides a parameter value at run time
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.3.jar wordcount -D mapreduce.job.queuename=hive /input /output

If you run your own Java program, the queue can also be set through the Configuration object:

// Submit to the hive queue
configuration.set("mapreduce.job.queuename", "hive");

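Job Priority Configuration

The Capacity Scheduler supports job priorities: when resources are tight, higher-priority jobs obtain resources first.

By default, YARN restricts the priority of every job to 0; to use job priorities, this restriction must be lifted.

Edit yarn-site.xml and add the parameter: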

<property>
    <!-- Maximum job priority in YARN; default 0 -->
    <!-- A value of 5 allows six priority levels, 0 through 5; a larger number means a higher priority -->
    <name>yarn.cluster.max-application-priority</name>
    <value>5</value>
</property>

Distribute the file to the other nodes in the cluster.

Restart the YARN cluster:

sbin/stop-yarn.sh
sbin/start-yarn.sh

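When the cluster runs short of resources and jobs queue up, raising a job's priority lets it run first:

# Set the priority when the job is launched
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.3.jar wordcount -D mapreduce.job.priority=5 /input /output
# The priority of a running job can also be changed from the command line
yarn application -appId <app_id> -updatePriority 5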

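Fair Scheduler Example

Requirement: in addition to the default queue, create two more queues, test and tengyer (named after the users' groups). The desired behavior: if a user specifies a queue at submission, the job runs in that queue; if no queue is specified, jobs submitted by user test run in the root.group.test queue and jobs submitted by user tengyer run in the root.group.tengyer queue (where group is the user's group).

Configuring the Fair Scheduler involves two files: yarn-site.xml and the Fair Scheduler allocation file fair-scheduler.xml (the file name can be customized).

The Fair Scheduler is widely used at medium and large companies.

Edit yarn-site.xml as follows: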

<property>
    <!-- Use the Fair Scheduler -->
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
    <!-- Path of the Fair Scheduler allocation file -->
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>/opt/module/hadoop-3.2.3/etc/hadoop/fair-scheduler.xml</value>
</property>
<property>
    <!-- Disable resource preemption between queues -->
    <name>yarn.scheduler.fair.preemption</name>
    <value>false</value>
</property>

配置 fair-scheduler.xml：

<?xml version="1.0"?>
<allocations>
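    <!-- Maximum share of a queue's resources that Application Masters may use; range 0-1, companies typically use 0.1 -->
    <queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
    <!-- Default maximum resources for a single queue (applies to test, tengyer, and default) -->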
    <queueMaxResourcesDefault>4096mb,4vcores</queueMaxResourcesDefault>
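    <!-- Add a queue named test -->
    <queue name="test">
        <!-- Minimum resources of the queue -->
        <minResources>2048mb,2vcores</minResources>
        <!-- Maximum resources of the queue -->
        <maxResources>4096mb,4vcores</maxResources>
        <!-- Maximum number of applications running concurrently in the queue; default 50, size it according to the thread count -->
        <maxRunningApps>4</maxRunningApps>
        <!-- Maximum share of the queue's resources that Application Masters may use -->
        <maxAMShare>0.5</maxAMShare>
        <!-- Resource weight of the queue; default 1.0 -->
        <weight>1.0</weight>
        <!-- Resource allocation policy inside the queue -->
        <schedulingPolicy>fair</schedulingPolicy>
    </queue>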
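    <!-- Add a queue named tengyer -->
    <!-- With type set to parent, the queue becomes a parent queue -->
    <queue name="tengyer" type="parent">
        <!-- Minimum resources of the queue -->
        <minResources>2048mb,2vcores</minResources>
        <!-- Maximum resources of the queue -->
        <maxResources>4096mb,4vcores</maxResources>
        <!-- Maximum number of applications running concurrently in the queue; default 50, size it according to the thread count -->
        <maxRunningApps>4</maxRunningApps>
        <!-- maxAMShare (the Application Master share) applies only to leaf queues, not to parent queues, so it must not be set here; otherwise the ResourceManager fails to start. -->
        <!--<maxAMShare>0.5</maxAMShare>-->
        <!-- Resource weight of the queue; default 1.0 -->
        <weight>1.0</weight>
        <!-- Resource allocation policy inside the queue -->
        <schedulingPolicy>fair</schedulingPolicy>
    </queue>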
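    <!-- Queue placement policy; several rules can be listed and are tried in order, starting with the first, until one matches -->
    <queuePlacementPolicy>
        <!-- Use the queue specified at submission; if none was specified, continue to the next rule. create="false" means a specified queue that does not exist is not created automatically -->
        <rule name="specified" create="false" />
        <!-- Submit to the root.group.username queue; if root.group does not exist it is not created automatically, while a missing root.group.username is created automatically -->
        <rule name="nestedUserQueue" create="true">
            <rule name="primaryGroup" create="false" />
        </rule>
        <!-- The last rule must be reject or default. reject: if none of the earlier rules matched, placement is refused and the submission fails. default: the job is placed in the default queue -->
        <!-- Alternatively, use the default rule: name="default" queue="name of a fallback queue" -->
        <rule name="reject" />
    </queuePlacementPolicy>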
</allocations>

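Distribute the configuration files to the other nodes in the cluster.

Restart the YARN cluster.

Specifying a queue at submission:

Under the configured placement policy, a job that specifies a queue runs in that queue. If the specified queue does not exist, however, it is not created.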

<rule name="specified" create="false" />

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.3.jar wordcount -D mapreduce.job.queuename=root.test /input /output

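Under the policy configured above, a job that does not specify a queue is submitted to the queue named after the user, that is, root.tengyer.tengyer (root.group.username):

We declared the queue root.tengyer with type="parent". The placement policy then forbids creating the primaryGroup queue (the root.group parent queue) but allows creating the nestedUserQueue queue (root.group.username), so root.tengyer.tengyer can be created under the parent queue root.tengyer.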

<rule name="nestedUserQueue" create="true">
    <rule name="primaryGroup" create="false" />
</rule>

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.3.jar wordcount /input /output
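
The rule chain above can be modeled in a few lines to see where a job lands. This is a simplified sketch of the three configured rules, not the Fair Scheduler's actual code: the function name, the two queue sets, and the user alice with group staff are assumptions made for illustration, and it assumes that a rule with create="false" whose queue is not configured falls through to the next rule.

```python
# Queues declared in fair-scheduler.xml above, plus the built-in default queue.
LEAF_QUEUES = {"root.default", "root.test"}
PARENT_QUEUES = {"root.tengyer"}


def place_application(user, primary_group, requested_queue=None):
    """Return the queue a job is placed in, or None if it is rejected."""
    # Rule 1: specified, create="false".
    if requested_queue:
        queue = requested_queue
        if not queue.startswith("root."):
            queue = "root." + queue
        if queue in LEAF_QUEUES or queue in PARENT_QUEUES:
            return queue
        # Not configured and creation is disabled: try the next rule.

    # Rule 2: nestedUserQueue (create="true") around primaryGroup (create="false").
    # root.<group> must already exist as a parent queue; the per-user
    # queue below it may be created on demand.
    parent = "root." + primary_group
    if parent in PARENT_QUEUES:
        return parent + "." + user

    # Rule 3: reject.
    return None


print(place_application("tengyer", "tengyer", "root.test"))  # root.test
print(place_application("tengyer", "tengyer"))               # root.tengyer.tengyer
print(place_application("alice", "staff"))                   # None
```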
