Notes:

  • The request tool used in this article is Postman.

Now for the main text.

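We currently face a problem: cluster resources are allocated unevenly and core data is arriving late, yet the YARN web UI does not give enough information to analyze resource usage at each point in time (number of tasks, cores in use, task ownership, and so on).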

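According to the official site, YARN provides RESTful APIs for retrieving information about the ResourceManager and NodeManager nodes. Documentation: https://hadoop.apache.org/docs/r2.6.5/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html

(The r2.6.5 docs are linked because our company's cluster runs 2.6.x.)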

The Hadoop YARN web service REST APIs are a set of URI resources that give access to the cluster, nodes, applications, and application historical information. The URI resources are grouped into APIs based on the type of information returned. Some URI resources return collections while others return singletons.

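Everything shown on the YARN web UI can also be obtained from these APIs, so we can build our own customized views of the data. In addition, the web UI becomes sluggish when many tasks have been submitted.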

Cluster-related APIs

Cluster Information API

  • http://<rm http address:port>/ws/v1/cluster
  • http://<rm http address:port>/ws/v1/cluster/info

Both endpoints return the cluster information. If the request sets Accept: application/xml, the response is XML; if it sets Accept: application/json, the response is JSON.
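The effect of the Accept header can be sketched with the Python standard library. This is only an illustration: the host `rm.example.com:8088` is a placeholder (substitute your own ResourceManager address and port), and `build_request` and `fetch` are hypothetical helper names, not part of YARN.

```python
import json
import urllib.request

# Placeholder address (assumption); replace with your ResourceManager host:port.
RM_BASE = "http://rm.example.com:8088"


def build_request(path, fmt="json", base=RM_BASE):
    """Build a GET request whose Accept header asks for JSON or XML."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    return urllib.request.Request(
        base + path,
        headers={"Accept": "application/" + fmt},
        method="GET",
    )


def fetch(path, fmt="json", base=RM_BASE, timeout=10):
    """Send the request; return a dict for JSON, raw text for XML."""
    request = build_request(path, fmt, base)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body) if fmt == "json" else body


if __name__ == "__main__":
    # Only builds and prints the request; no network call is made here.
    req = build_request("/ws/v1/cluster/info", "xml")
    print(req.full_url, req.get_header("Accept"))
```

Calling `fetch("/ws/v1/cluster/info")` against a reachable ResourceManager should return the same data that Postman shows for a GET with Accept: application/json.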

| Request method | Accepts parameters | Response format |
| --- | --- | --- |
| GET | | JSON/XML (see the example below) |

[Image: example response]

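Response fields:

| Field | Type | Meaning |
| --- | --- | --- |
| id | long | Cluster ID |
| startedOn | long | Time the cluster started (ms since epoch) |
| state | string | ResourceManager state: NOTINITED, INITED, STARTED, STOPPED |
| haState | string | ResourceManager HA state: INITIALIZING, ACTIVE, STANDBY, STOPPED |
| resourceManagerVersion | string | ResourceManager version |
| resourceManagerBuildVersion | string | ResourceManager build string |
| resourceManagerVersionBuiltOn | string | When the ResourceManager was built |
| hadoopVersion | string | Hadoop version |
| hadoopBuildVersion | string | Hadoop build string |
| hadoopVersionBuiltOn | string | When Hadoop was built |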
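As a rough illustration of reading these fields, the sketch below parses a made-up JSON payload shaped like a cluster info response (wrapped in a `clusterInfo` object) and converts `startedOn` from milliseconds to a UTC timestamp. The sample values and the helper name `summarize_cluster_info` are assumptions for the example only.

```python
import json
from datetime import datetime, timezone

# Made-up sample shaped like a /ws/v1/cluster/info JSON response.
SAMPLE = """
{"clusterInfo": {"id": 1600000000000, "startedOn": 1600000000000,
 "state": "STARTED", "haState": "ACTIVE", "hadoopVersion": "2.6.5"}}
"""


def summarize_cluster_info(payload):
    """Pick a few fields and convert startedOn (ms since epoch) to UTC."""
    info = payload["clusterInfo"]
    started = datetime.fromtimestamp(info["startedOn"] / 1000, tz=timezone.utc)
    return {
        "id": info["id"],
        "state": info["state"],
        "haState": info.get("haState"),
        "hadoopVersion": info.get("hadoopVersion"),
        "startedOnUtc": started.isoformat(),
    }


summary = summarize_cluster_info(json.loads(SAMPLE))
print(summary)
```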

Cluster Metrics API

  • http://<rm http address:port>/ws/v1/cluster/metrics

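This endpoint provides overall metrics for the cluster's resources, such as the total number of running applications and the total number of cores in use.

If the request sets Accept: application/xml, the response is XML; if it sets Accept: application/json, the response is JSON.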

| Request method | Accepts parameters | Response format |
| --- | --- | --- |
| GET | | JSON/XML (see the example below) |

Response fields:

| Item | Data Type | Description |
| --- | --- | --- |
| appsSubmitted | int | The number of applications submitted |
| appsCompleted | int | The number of applications completed |
| appsPending | int | The number of applications pending |
| appsRunning | int | The number of applications running |
| appsFailed | int | The number of applications failed |
| appsKilled | int | The number of applications killed |
| reservedMB | long | The amount of memory reserved in MB |
| availableMB | long | The amount of memory available in MB |
| allocatedMB | long | The amount of memory allocated in MB |
| totalMB | long | The amount of total memory in MB |
| reservedVirtualCores | long | The number of reserved virtual cores |
| availableVirtualCores | long | The number of available virtual cores |
| allocatedVirtualCores | long | The number of allocated virtual cores |
| totalVirtualCores | long | The total number of virtual cores |
| containersAllocated | int | The number of containers allocated |
| containersReserved | int | The number of containers reserved |
| containersPending | int | The number of containers pending |
| totalNodes | int | The total number of nodes |
| activeNodes | int | The number of active nodes |
| lostNodes | int | The number of lost nodes |
| unhealthyNodes | int | The number of unhealthy nodes |
| decommissionedNodes | int | The number of nodes decommissioned |
| rebootedNodes | int | The number of nodes rebooted |
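One way these metrics could feed the per-moment analysis mentioned at the start is to turn a snapshot into utilization ratios. The sketch below works on a made-up payload wrapped in a `clusterMetrics` object; the numbers and the helper name `utilization` are assumptions, and in practice the payload would come from a periodic GET of the metrics endpoint.

```python
import json

# Made-up sample shaped like a /ws/v1/cluster/metrics JSON response.
SAMPLE = """
{"clusterMetrics": {"appsRunning": 12, "appsPending": 3,
 "allocatedMB": 61440, "totalMB": 81920,
 "allocatedVirtualCores": 30, "totalVirtualCores": 40,
 "activeNodes": 5, "totalNodes": 5}}
"""


def utilization(payload):
    """Return running/pending app counts plus memory and vcore usage ratios."""
    m = payload["clusterMetrics"]

    def ratio(used, total):
        # Guard against a cluster that reports zero capacity.
        return round(used / total, 4) if total else 0.0

    return {
        "appsRunning": m["appsRunning"],
        "appsPending": m["appsPending"],
        "memoryUsed": ratio(m["allocatedMB"], m["totalMB"]),
        "vcoresUsed": ratio(m["allocatedVirtualCores"], m["totalVirtualCores"]),
    }


snapshot = utilization(json.loads(SAMPLE))
print(snapshot)
```

Storing one such snapshot per polling interval, together with a timestamp, gives the time series that the web UI does not provide.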

Cluster Scheduler API

Queues

  • http://<rm http address:port>/ws/v1/cluster/scheduler
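The scheduler response is a tree of queues. The sketch below flattens such a tree, assuming the Capacity Scheduler layout in which each queue carries `queueName`, `usedCapacity`, and an optional nested `queues.queue` list. The sample data is made up, `walk_queues` is a hypothetical helper, and other schedulers (for example the Fair Scheduler) use a different shape.

```python
# Made-up sample, assuming the Capacity Scheduler JSON layout.
SAMPLE = {
    "scheduler": {
        "schedulerInfo": {
            "type": "capacityScheduler",
            "queueName": "root",
            "usedCapacity": 40.0,
            "queues": {
                "queue": [
                    {"queueName": "default", "usedCapacity": 25.0},
                    {
                        "queueName": "etl",
                        "usedCapacity": 55.0,
                        "queues": {
                            "queue": [{"queueName": "hourly", "usedCapacity": 55.0}]
                        },
                    },
                ]
            },
        }
    }
}


def walk_queues(queue, depth=0):
    """Yield (depth, queueName, usedCapacity) for a queue and its children."""
    yield depth, queue.get("queueName"), queue.get("usedCapacity")
    children = (queue.get("queues") or {}).get("queue", [])
    if isinstance(children, dict):
        # Tolerate a single child serialized as an object instead of a list.
        children = [children]
    for child in children:
        yield from walk_queues(child, depth + 1)


rows = list(walk_queues(SAMPLE["scheduler"]["schedulerInfo"]))
for depth, name, used in rows:
    print("  " * depth + f"{name}: {used}% used")
```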

Cluster Application Statistics API

Cluster Application API

Cluster Application Attempts API

Cluster Nodes API

  • http://<rm http address:port>/ws/v1/cluster/nodes
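Since the original problem is uneven resource allocation, a per-node summary is a natural use of this endpoint. The sketch below assumes the JSON response wraps the list as `nodes.node` and that each entry has `nodeHostName`, `state`, `usedMemoryMB`, `availMemoryMB`, and `numContainers`. The host names and values are made up, and `summarize_nodes` is a hypothetical helper.

```python
from collections import Counter

# Made-up sample shaped like a /ws/v1/cluster/nodes JSON response.
SAMPLE = {
    "nodes": {
        "node": [
            {"nodeHostName": "host1.example.com", "state": "RUNNING",
             "usedMemoryMB": 6144, "availMemoryMB": 2048, "numContainers": 6},
            {"nodeHostName": "host2.example.com", "state": "RUNNING",
             "usedMemoryMB": 1024, "availMemoryMB": 7168, "numContainers": 1},
            {"nodeHostName": "host3.example.com", "state": "LOST",
             "usedMemoryMB": 0, "availMemoryMB": 0, "numContainers": 0},
        ]
    }
}


def summarize_nodes(payload):
    """Count nodes per state and find the RUNNING node using the most memory."""
    nodes = (payload.get("nodes") or {}).get("node", [])
    by_state = Counter(n["state"] for n in nodes)
    running = [n for n in nodes if n["state"] == "RUNNING"]
    busiest = max(running, key=lambda n: n["usedMemoryMB"], default=None)
    busiest_host = busiest["nodeHostName"] if busiest else None
    return dict(by_state), busiest_host


states, busiest_host = summarize_nodes(SAMPLE)
print(states, busiest_host)
```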

Cluster Node API