Original document: Ambari参数优化汇总.docx

Ambari Server 2.7.3 Configuration Parameter Tuning

Ambari Server configuration file: /etc/ambari-server/conf/ambari.properties

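1. √ upgrade.parameter.nn-restart.timeout

[Default value]:
Ambari Parameter Tuning Summary - Figure 1
[Parameter description]:
In a large cluster, restarting the NameNode takes a long time. To keep the connection between Ambari Server and the NameNode from timing out, adjust this parameter in /etc/ambari-server/conf/ambari.properties according to the NameNode restart time measured on site. Rule of thumb: measured restart time × (1 + 10%), for example:
upgrade.parameter.nn-restart.timeout=660

After changing the parameter, be sure to restart the Ambari Server service before running the HDP upgrade:
sudo ambari-server restart
[Recommended value]:
upgrade.parameter.nn-restart.timeout=3600

[Remarks]:
Preliminary checks at the Hubei site show that a NameNode restart takes about 40 minutes.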
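The rule of thumb above (measured restart time plus 10%) can be sketched as a small helper. The function name and the choice to round up to whole seconds are assumptions made for illustration; they are not part of Ambari.

```python
def nn_restart_timeout(measured_restart_seconds: int, margin_percent: int = 10) -> int:
    """Measured restart time plus a percentage margin, rounded up to whole seconds."""
    # Integer arithmetic avoids floating-point surprises (600 * 1.1 is not exactly 660).
    return (measured_restart_seconds * (100 + margin_percent) + 99) // 100


if __name__ == "__main__":
    print(nn_restart_timeout(600))      # a 10-minute restart
    print(nn_restart_timeout(40 * 60))  # the roughly 40-minute restart observed on site
```

For a 40-minute restart the rule gives a value below the recommended 3600, so the recommended setting leaves additional headroom.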

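2. √ Environment variable AMBARI_JVM_ARGS

[Default value]:
/var/lib/ambari-server/ambari-env.sh
Ambari Parameter Tuning Summary - Figure 2
[Parameter description]:
JVM memory settings
[Recommended value]:
-Xms102400m -Xmx102400m -Xmn25600m -XX:NewRatio=3 -XX:MaxPermSize=1024m -XX:PermSize=512m
[Remarks]:
The Hubei site has more than 1,300 nodes. Memory information for the site environment is shown below.
Ambari Parameter Tuning Summary - Figure 3
Ambari Parameter Tuning Summary - Figure 4

Ambari Parameter Tuning Summary - Figure 5

Cluster Nodes    Xmx      Xmn
100 - 400        4 GB     2 GB
400 - 800        4 GB     2 GB
800 - 1200       8 GB     2 GB
1200 - 1600      16 GB    2.4 GB

In production, -Xms is usually set equal to -Xmx. If they differ, the JVM may have to grow and shrink the heap frequently in some scenarios, which adds processing time. -XX:MaxPermSize sets the maximum size of the permanent generation (1/4 of physical memory); -XX:PermSize sets the initial size of the permanent generation (perm gen) (1/64 of physical memory). To check the system memory size: free -h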
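The sizing table above can be read as a simple lookup. The sketch below is only an illustration: the function name is hypothetical, and because the table's ranges share their boundary values (400, 800, 1200), it assumes a boundary value falls into the lower row.

```python
from typing import Optional, Tuple

# (min_nodes, max_nodes, Xmx in GB, Xmn in GB), copied from the table above.
HEAP_TABLE = [
    (100, 400, 4.0, 2.0),
    (400, 800, 4.0, 2.0),
    (800, 1200, 8.0, 2.0),
    (1200, 1600, 16.0, 2.4),
]


def heap_for_cluster(nodes: int) -> Optional[Tuple[float, float]]:
    """Return (Xmx_GB, Xmn_GB) for a cluster size, or None if the table does not cover it."""
    for low, high, xmx_gb, xmn_gb in HEAP_TABLE:
        if low <= nodes <= high:  # first match wins, so a boundary value uses the lower row
            return xmx_gb, xmn_gb
    return None


if __name__ == "__main__":
    print(heap_for_cluster(1300))
```

The table is only a baseline; the value recommended for the Hubei site above is far larger than the table entry for its cluster size.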

3. √ JDBC connection pool parameters and MySQL database parameters

[Default value]:
Ambari Parameter Tuning Summary - Figure 6
org.apache.ambari.server.configuration.Configuration.java
@Markdown(
    relatedTo = "server.jdbc.connection-pool",
    description = "The minimum number of connections that should always exist in the database connection pool. Only used with c3p0.")
public static final ConfigurationProperty<Integer> SERVER_JDBC_CONNECTION_POOL_MIN_SIZE = new ConfigurationProperty<>(
    "server.jdbc.connection-pool.min-size", 5);

@Markdown(
    relatedTo = "server.jdbc.connection-pool",
    description = "The maximum number of connections that should exist in the database connection pool. Only used with c3p0.")
public static final ConfigurationProperty<Integer> SERVER_JDBC_CONNECTION_POOL_MAX_SIZE = new ConfigurationProperty<>(
    "server.jdbc.connection-pool.max-size", 32);

@Markdown(
    relatedTo = "server.jdbc.connection-pool",
    description = "The maximum amount of time, in seconds, that an idle connection can remain in the pool. "
        + "This should always be greater than the value returned from server.jdbc.connection-pool.max-idle-time-excess. Only used with c3p0.")
public static final ConfigurationProperty<Integer> SERVER_JDBC_CONNECTION_POOL_MAX_IDLE_TIME = new ConfigurationProperty<>(
    "server.jdbc.connection-pool.max-idle-time", 14400);

@Markdown(
    relatedTo = "server.jdbc.connection-pool",
    description = "The number of seconds in between testing each idle connection in the connection pool for validity. Only used with c3p0.")
public static final ConfigurationProperty<Integer> SERVER_JDBC_CONNECTION_POOL_IDLE_TEST_INTERVAL = new ConfigurationProperty<>(
    "server.jdbc.connection-pool.idle-test-interval", 7200);
[Parameter description]:
JDBC connection pool and MySQL database connection parameters
[Recommended value]: to be determined
server.jdbc.connection-pool.min-size=500
server.jdbc.connection-pool.max-size=1500
/etc/my.cnf
max_connections=3000
interactive_timeout=28800
wait_timeout=28800
[Remarks]:
If MySQL is used as the Ambari database, raise wait_timeout and interactive_timeout to 8 hours (28800) in the MySQL configuration, and raise max_connections from 32 to 128.
Ambari Parameter Tuning Summary - Figure 7
Ambari Parameter Tuning Summary - Figure 8
Important:
It is critical that the Ambari settings server.jdbc.connection-pool.max-idle-time and server.jdbc.connection-pool.idle-test-interval be lower than wait_timeout and interactive_timeout on the MySQL side. If you choose to lower those MySQL timeouts, lower server.jdbc.connection-pool.max-idle-time and server.jdbc.connection-pool.idle-test-interval in the Ambari configuration accordingly so that they stay below wait_timeout and interactive_timeout. The MySQL parameters are configured in /etc/my.cnf.
The effective upper limit of max_connections (the actual maximum number of connections) is 16384; a larger value is treated as 16384. Raising max_connections does not by itself consume many system resources. CPU and memory usage depend mainly on query density and efficiency.
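The ordering constraint between the Ambari pool settings and the MySQL timeouts can be checked mechanically before a change is rolled out. The following is a minimal sketch; the function name and messages are hypothetical, and the values must be read from ambari.properties and /etc/my.cnf by hand or by other tooling.

```python
from typing import List


def pool_timeout_violations(max_idle_time: int, idle_test_interval: int,
                            wait_timeout: int, interactive_timeout: int) -> List[str]:
    """Return a message for each Ambari pool setting that is not below the MySQL timeouts."""
    mysql_limit = min(wait_timeout, interactive_timeout)
    problems = []
    if max_idle_time >= mysql_limit:
        problems.append("server.jdbc.connection-pool.max-idle-time must be below %d" % mysql_limit)
    if idle_test_interval >= mysql_limit:
        problems.append("server.jdbc.connection-pool.idle-test-interval must be below %d" % mysql_limit)
    return problems


if __name__ == "__main__":
    # Ambari defaults (14400, 7200) against the recommended MySQL timeouts (28800).
    print(pool_timeout_violations(14400, 7200, 28800, 28800))
    # The same Ambari defaults against MySQL timeouts lowered to one hour.
    print(pool_timeout_violations(14400, 7200, 3600, 3600))
```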

4. √ Object instance cache size server.ecCacheSize

[Default value]:
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "The size of the cache which is used to hold current operations in memory until they complete.")
public static final ConfigurationProperty<Long> SERVER_EC_CACHE_SIZE = new ConfigurationProperty<>("server.ecCacheSize", 10000L);
Ambari Parameter Tuning Summary - Figure 9
[Parameter description]:
The size of the cache used to hold {@link HostRoleCommand} instances in-memory.
[Recommended value]:
server.ecCacheSize=80000
[Remarks]:
The Hubei site has more than 1,300 nodes.
ecCacheSizeValue = 60 * cluster_size
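A worked instance of the sizing formula above: for 1,300 hosts it yields 78000, which is consistent with the recommended 80000 if the result is rounded up. The rounding step of 10000 is an assumption made for illustration; the notes do not state how the final value was chosen.

```python
def ec_cache_size(cluster_size: int, per_host: int = 60) -> int:
    """ecCacheSizeValue = 60 * cluster_size, as given in the remarks."""
    return per_host * cluster_size


def round_up(value: int, step: int) -> int:
    """Round value up to the next multiple of step (assumed rounding rule)."""
    return -(-value // step) * step


if __name__ == "__main__":
    raw = ec_cache_size(1300)
    print(raw, round_up(raw, 10000))
```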

5. √ Number of parallel tasks per upgrade batch: stack.upgrade.default.parallelism

[Default value]:
stack.upgrade.default.parallelism=100
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "Default value of max number of tasks to schedule in parallel for upgrades. Upgrade packs can override this value.")
public static final ConfigurationProperty<Integer> DEFAULT_MAX_DEGREE_OF_PARALLELISM_FOR_UPGRADES = new ConfigurationProperty<>(
    "stack.upgrade.default.parallelism", 100);
Ambari Parameter Tuning Summary - Figure 10
[Parameter description]:
Default value of max number of tasks to schedule in parallel for upgrades.
[Recommended value]:
stack.upgrade.default.parallelism=300
[Remarks]: to be verified
// Add stages where the restart stages are ordered
// E.g., preupgrade, restart hosts(0), …, restart hosts(n-1), postupgrade
void org.apache.ambari.server.state.stack.upgrade.Grouping.DefaultBuilder.add(UpgradeContext context, HostsType hostsType, String service, boolean clientOnly, ProcessingComponent pc, Map params)

void org.apache.ambari.server.state.stack.upgrade.Grouping.DefaultBuilder.addTasksToStageInBatches(List tasks, String verb, UpgradeContext ctx, String service, ProcessingComponent pc, Map params)
Ambari Parameter Tuning Summary - Figure 11
String org.apache.ambari.server.state.stack.upgrade.StageWrapperBuilder.getStageText(String prefix, String component, Set hosts, int batchNum, int totalBatches)
Ambari Parameter Tuning Summary - Figure 12

Ambari Parameter Tuning Summary - Figure 13

Ambari Parameter Tuning Summary - Figure 14
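To see what raising the parallelism means for a cluster of this size, the sketch below splits a host list into batches no larger than the parallelism limit. It only mirrors the general idea behind batching tasks into stages; it is not the code of addTasksToStageInBatches, and the host names are made up.

```python
from typing import List, Sequence


def split_into_batches(hosts: Sequence[str], parallelism: int) -> List[List[str]]:
    """Split hosts into consecutive batches of at most `parallelism` entries."""
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    return [list(hosts[i:i + parallelism]) for i in range(0, len(hosts), parallelism)]


if __name__ == "__main__":
    hosts = ["host%04d" % i for i in range(1300)]  # hypothetical host names
    for limit in (100, 300):
        batches = split_into_batches(hosts, limit)
        print(limit, len(batches), len(batches[-1]))
```

With about 1,300 hosts, the default limit of 100 gives 13 batches, while 300 gives 5.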

6. √ agent.package.parallel.commands.limit

[Default value]:
agent.package.parallel.commands.limit=100
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "The maximum number of tasks which can run within a single operational request. If there are more tasks, then they will be broken up between multiple operations.")
public static final ConfigurationProperty<Integer> AGENT_PACKAGE_PARALLEL_COMMANDS_LIMIT = new ConfigurationProperty<>(
    "agent.package.parallel.commands.limit", 100);
[Parameter description]:
The maximum number of tasks which can run within a single operational request. If there are more tasks, then they will be broken up between multiple operations.
[Recommended value]: keep this consistent with stack.upgrade.default.parallelism
agent.package.parallel.commands.limit=300
[Remarks]:
RequestStatus org.apache.ambari.server.controller.internal.ClusterStackVersionResourceProvider.createOrUpdateHostVersions(Cluster cluster, RepositoryVersionEntity repoVersionEntity, VersionDefinitionXml versionDefinitionXml, StackId stackId, boolean forceInstalled, Map propertyMap)
Ambari Parameter Tuning Summary - Figure 15
RequestStageContainer org.apache.ambari.server.controller.internal.ClusterStackVersionResourceProvider.createOrchestration(Cluster cluster, StackId stackId, List hosts, RepositoryVersionEntity repoVersionEnt, VersionDefinitionXml desiredVersionDefinition, Map propertyMap)
Ambari Parameter Tuning Summary - Figure 16

7. agent.task.timeout

[Default value]:
agent.task.timeout=900
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "The time, in seconds, before agent commands are killed. This does not include package installation commands.")
public static final ConfigurationProperty<Long> AGENT_TASK_TIMEOUT = new ConfigurationProperty<>("agent.task.timeout", 900L);
[Parameter description]: default task timeout
The time, in seconds, before agent commands are killed. This does not include package installation commands.
[Recommended value]:
Use the default value.
[Remarks]:
Changing this value is not recommended unless the executing hosts are so slow that tasks genuinely cannot finish within the normal duration.
void org.apache.ambari.server.controller.AmbariActionExecutionHelper.addExecutionCommandsToStage(ActionExecutionContext actionContext, Stage stage, Map requestParams, boolean checkHostIsMemberOfCluster) throws AmbariException
Ambari Parameter Tuning Summary - Figure 17
Short org.apache.ambari.server.state.stack.upgrade.StageWrapper.getMaxTimeout(Configuration configuration)
Ambari Parameter Tuning Summary - Figure 18
// Task timeout setting: when it is not configured, the default task timeout is used
String org.apache.ambari.server.state.stack.upgrade.Task.timeoutConfig
This corresponds to the timeout-config attribute of the task node in /var/lib/ambari-server/resources//.xml

8. agent.package.install.task.timeout

[Default value]:
agent.package.install.task.timeout=1800
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "The time, in seconds, before package installation commands are killed.")
public static final ConfigurationProperty<Long> AGENT_PACKAGE_INSTALL_TASK_TIMEOUT = new ConfigurationProperty<>("agent.package.install.task.timeout", 1800L);
[Parameter description]:
The time, in seconds, before package installation commands are killed.
[Recommended value]:
Use the default value.
[Remarks]:
Changing this value is not recommended unless the executing hosts are so slow that tasks genuinely cannot finish within the normal duration.
org.apache.ambari.server.controller.AmbariManagementControllerImpl.createAndPersistStages()
org.apache.ambari.server.controller.AmbariManagementControllerImpl.addStages(…)
org.apache.ambari.server.controller.AmbariManagementControllerImpl.doStageCreation(…)
Ambari Parameter Tuning Summary - Figure 19
void org.apache.ambari.server.controller.AmbariManagementControllerImpl.createHostAction(
    Cluster cluster,
    Stage stage,
    ServiceComponentHost scHost,
    Map<String, Map<String, String>> configurations,
    Map<String, Map<String, Map<String, String>>> configurationAttributes,
    Map<String, Map<String, String>> configTags,
    RoleCommand roleCommand,
    Map commandParamsInp,
    ServiceComponentHostEvent event,
    boolean skipFailure,
    RepositoryVersionEntity repoVersion,
    boolean isUpgradeSuspended,
    DatabaseType databaseType,
    Map clusterDesiredConfigs,
    boolean useLatestConfigs) throws AmbariException
Ambari Parameter Tuning Summary - Figure 20

Ambari Parameter Tuning Summary - Figure 21

9. √ server.hrcStatusSummary.cache.size

[Default value]:
server.hrcStatusSummary.cache.size=10000
org.apache.ambari.server.configuration.Configuration.java
@Markdown(
    relatedTo = "server.hrcStatusSummary.cache.enabled",
    description = "The size of the cache which is used to hold a status of every operation in a request.")
public static final ConfigurationProperty<Long> SERVER_HRC_STATUS_SUMMARY_CACHE_SIZE = new ConfigurationProperty<>("server.hrcStatusSummary.cache.size", 10000L);
[Parameter description]:
The size of the cache which is used to hold a status of every operation in a request.
[Recommended value]:
server.hrcStatusSummary.cache.size=80000
[Remarks]:
org.apache.ambari.server.controller.ControllerModule
Ambari Parameter Tuning Summary - Figure 22
org.apache.ambari.server.orm.dao.HostRoleCommandDAO.HostRoleCommandDAO(
    @Named(value=HRC_STATUS_SUMMARY_CACHE_ENABLED)
    boolean hostRoleCommandStatusSummaryCacheEnabled,
    @Named(value=HRC_STATUS_SUMMARY_CACHE_SIZE)
    long hostRoleCommandStatusSummaryCacheLimit,
    @Named(value=HRC_STATUS_SUMMARY_CACHE_EXPIRY_DURATION_MINUTES)
    long hostRoleCommandStatusSummaryCacheExpiryDurationMins)
Ambari Parameter Tuning Summary - Figure 23

10. √ subscription.registry.cache.size

[Default value]:
subscription.registry.cache.size=1500
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "Maximal cache size for spring subscription registry.")
public static final ConfigurationProperty<Integer> SUBSCRIPTION_REGISTRY_CACHE_MAX_SIZE = new ConfigurationProperty<>(
    "subscription.registry.cache.size", 1500);
[Parameter description]:
Maximal cache size for spring subscription registry.
[Recommended value]:
subscription.registry.cache.size=51200
[Remarks]:
void org.apache.ambari.server.configuration.spring.RootStompConfig.configureRegistryCacheSize(SimpleBrokerMessageHandler simpleBrokerMessageHandler)
Ambari Parameter Tuning Summary - Figure 24
org.apache.ambari.server.agent.stomp.AmbariSubscriptionRegistry extends AbstractSubscriptionRegistry
Ambari Parameter Tuning Summary - Figure 25
Ambari Parameter Tuning Summary - Figure 26

11. √ STOMP agent registration parameters

[Default value]:
registration.threadpool.size=10
agents.registration.queue.size=200
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "Thread pool size for agents registration")
public static final ConfigurationProperty<Integer> REGISTRATION_THREAD_POOL_SIZE = new ConfigurationProperty<>("registration.threadpool.size", 10);

@Markdown(description = "Queue size for agents in registration.")
public static final ConfigurationProperty<Integer> AGENTS_REGISTRATION_QUEUE_SIZE = new ConfigurationProperty<>(
    "agents.registration.queue.size", 200);
[Parameter description]:
Thread pool size and queue size used for STOMP agent registration
[Recommended value]:
registration.threadpool.size=100
agents.registration.queue.size=2000
[Remarks]:
org.apache.ambari.server.agent.stomp.AgentsRegistrationQueue
Ambari Parameter Tuning Summary - Figure 27
org.apache.ambari.server.agent.stomp.HeartbeatController.HeartbeatController(…)
Ambari Parameter Tuning Summary - Figure 28

12. √ agents.reports.thread.pool.size

[Default value]:
agents.reports.thread.pool.size=10
@Markdown(description = "Thread pool size for agents reports processing.")
public static final ConfigurationProperty<Integer> AGENTS_REPORT_THREAD_POOL_SIZE = new ConfigurationProperty<>("agents.reports.thread.pool.size", 10);
[Parameter description]:
Thread pool size for agents reports processing.
[Recommended value]:
agents.reports.thread.pool.size=1500
[Remarks]:
org.apache.ambari.server.agent.AgentReportsProcessor
Ambari Parameter Tuning Summary - Figure 29
org.apache.ambari.server.agent.stomp.AgentReportsController
Ambari Parameter Tuning Summary - Figure 30
Sample ambari-agent.log output
Ambari Parameter Tuning Summary - Figure 31

13. √ messaging.threadpool.size

[Default value]:
messaging.threadpool.size=10
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "Thread pool size for spring messaging")
public static final ConfigurationProperty<Integer> MESSAGING_THREAD_POOL_SIZE = new ConfigurationProperty<>("messaging.threadpool.size", 10);
[Parameter description]:
The thread pool size for spring messaging.
[Recommended value]: to be determined
messaging.threadpool.size=256
[Remarks]:
messaging.threadpool.size=cpuCoreSize * 2 // commonly suggested default
The Hubei server environment has 64 cores.
org.apache.ambari.server.configuration.spring.AgentStompConfig extends AbstractWebSocketMessageBrokerConfigurer
configureClientInboundChannel: carries messages received from WebSocket clients
configureClientOutboundChannel: carries server messages sent to WebSocket clients
Ambari Parameter Tuning Summary - Figure 32
Reference:
https://blog.csdn.net/qq_32331073/article/details/83014829
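The "CPU cores × 2" rule from the remarks can be evaluated directly on the server host. In this sketch the function name and the `factor` parameter are illustrative assumptions, not Ambari settings.

```python
import os
from typing import Optional


def messaging_pool_size(cpu_cores: Optional[int] = None, factor: int = 2) -> int:
    """Suggested pool size: number of CPU cores times a factor (2 in the rule above)."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return cores * factor


if __name__ == "__main__":
    print(messaging_pool_size())    # based on the machine running this script
    print(messaging_pool_size(64))  # the 64-core server described above
```

For 64 cores the rule gives 128, so the tentative value of 256 corresponds to a factor of 4 rather than 2.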

14. √ *.api.acceptor.count

[Default value]:
private static final int DEFAULT_ACCEPTORS_COUNT = 1;
org.apache.ambari.server.controller.AmbariServer.java
private static final int DEFAULT_ACCEPTORS_COUNT = 1;
// Client Jetty thread pool - widen the thread pool if needed !
Integer clientAcceptors = configs.getClientApiAcceptors() != null ? configs
    .getClientApiAcceptors() : DEFAULT_ACCEPTORS_COUNT;
// Agent Jetty thread pool - widen the thread pool if needed !
Integer agentAcceptors = configs.getAgentApiAcceptors() != null ? configs
    .getAgentApiAcceptors() : DEFAULT_ACCEPTORS_COUNT;
ServerConnector agentOneWayConnector =
    createSelectChannelConnectorForAgent(serverForAgent, configs.getOneWayAuthPort(), false, agentAcceptors);
ServerConnector agentTwoWayConnector =
    createSelectChannelConnectorForAgent(serverForAgent, configs.getTwoWayAuthPort(), configs.isTwoWaySsl(), agentAcceptors);
org.apache.ambari.server.configuration.Configuration.java
@Markdown(description = "Count of acceptors to configure for the jetty connector used for Ambari agent.")
public static final ConfigurationProperty<Integer> SRVR_AGENT_ACCEPTOR_THREAD_COUNT = new ConfigurationProperty<>("agent.api.acceptor.count", null);

@Markdown(description = "Count of acceptors to configure for the jetty connector used for Ambari API.")
public static final ConfigurationProperty<Integer> SRVR_API_ACCEPTOR_THREAD_COUNT = new ConfigurationProperty<>("client.api.acceptor.count", null);
[Parameter description]:
The number of acceptor threads for the api/agent jetty connector.
[Recommended value]:
client.api.acceptor.count=8
agent.api.acceptor.count=8
[Remarks]:
Preliminary checks show that the Hubei site environment has at least 24 cores.
Source-code analysis of Jetty server's high-performance, multi-threaded design:
https://www.jianshu.com/p/695be1d07c38
LOG.warn("Acceptors should be <= availableProcessors");
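The warning quoted above suggests keeping each acceptor count at or below the number of available processors. A minimal pre-flight check might look like the sketch below; the function name is hypothetical, and it only compares the configured numbers with the core count rather than reproducing Jetty's own logic.

```python
import os
from typing import Dict, List, Optional


def oversized_acceptors(acceptor_counts: Dict[str, int],
                        available_processors: Optional[int] = None) -> List[str]:
    """Return the property names whose acceptor count exceeds the processor count."""
    cores = available_processors if available_processors is not None else (os.cpu_count() or 1)
    return sorted(name for name, count in acceptor_counts.items() if count > cores)


if __name__ == "__main__":
    recommended = {"client.api.acceptor.count": 8, "agent.api.acceptor.count": 8}
    print(oversized_acceptors(recommended, available_processors=24))
```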

15. √ client.threadpool.size.max

[Default value]:
Ambari Parameter Tuning Summary - Figure 33

org.apache.ambari.server.configuration.Configuration.java
@ConfigurationMarkdown(
    group = ConfigurationGrouping.JETTY_THREAD_POOL,
    scaleValues = {
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_10, value = "25"),
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_50, value = "35"),
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_100, value = "50"),
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_500, value = "65") },
    markdown = @Markdown(description = "The size of the Jetty connection pool used for handling incoming REST API requests. This should be large enough to handle requests from both web browsers and embedded Views."))
public static final ConfigurationProperty<Integer> CLIENT_THREADPOOL_SIZE = new ConfigurationProperty<>(
    "client.threadpool.size.max", 25);
[Parameter description]:
The max size of the Jetty connection pool used for handling incoming REST API requests.
[Recommended value]:
client.threadpool.size.max=300
[Remarks]:
The Hubei site has more than 1,300 nodes.

16. √ client.threadpool.size.min

[Default value]:
None. This parameter was added as part of a code optimization.
[Parameter description]:
The min size of the Jetty connection pool used for handling incoming REST API requests.
[Recommended value]:
client.threadpool.size.min=150
[Remarks]:
Add this configuration item and modify the related code.
org.apache.ambari.server.configuration.Configuration.java
/*
 * The size of the Jetty connection pool used for handling incoming REST API requests.
 */
public static final ConfigurationProperty<Integer> CLIENT_THREADPOOL_MIN_SIZE = new ConfigurationProperty<>(
    "client.threadpool.size.min", 25);

org.apache.ambari.server.controller.AmbariServer.java
protected Server configureJettyThreadPool(int acceptorThreads, String threadPoolName, int configuredThreadPoolSize)
QueuedThreadPool qtp = new QueuedThreadPool(maxPoolSize, minPoolSize);

17. √ agent.threadpool.size.max

[Default value]:
org.apache.ambari.server.configuration.Configuration.java
@ConfigurationMarkdown(
    group = ConfigurationGrouping.JETTY_THREAD_POOL,
    scaleValues = {
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_10, value = "25"),
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_50, value = "35"),
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_100, value = "75"),
        @ClusterScale(clusterSize = ClusterSizeType.HOSTS_500, value = "100") },
    markdown = @Markdown(description = "The size of the Jetty connection pool used for handling incoming Ambari Agent requests."))
public static final ConfigurationProperty<Integer> AGENT_THREADPOOL_SIZE = new ConfigurationProperty<>("agent.threadpool.size.max", 25);
[Parameter description]:
The size of the Jetty connection pool used for handling incoming Ambari Agent requests.
[Recommended value]:
agent.threadpool.size.max=1300
[Remarks]:
The Hubei site has more than 1,300 nodes.

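18. √ agent.threadpool.size.min

[Default value]:
None. This parameter was added as part of a code optimization.
[Parameter description]:
The min size of the Jetty connection pool used for handling incoming Ambari Agent requests.
[Recommended value]:
agent.threadpool.size.min=500
[Remarks]:
Add this configuration item and modify the related code.
The Hubei site has more than 1,300 nodes.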

19. √ Heartbeat message scheduling thread pool parameters

[Default value]:
org.apache.ambari.server.agent.HeartbeatProcessor.java
//TODO rewrite to correlate with heartbeat frequency, hardcoded in agent as of now
private long delay = 5000;
private long period = 1000;
private int poolSize = 1;
Ambari Parameter Tuning Summary - Figure 34
If any execution of this task takes longer than its period, then subsequent executions may start late, but will not concurrently execute.
[Parameter description]:
Initial delay, execution period, and thread pool size
[Recommended value]:
client.heartbeat.schedule.delay=3000
client.heartbeat.schedule.period=500
client.heartbeat.schedule.poolsize=200
[Remarks]:
Add these configuration items and modify the related code.
The three hard-coded values are moved into the server configuration file, under the property names listed in the recommended values, to make later maintenance easier.

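20. √ Initial capacity of the HeartBeatHandler cache maps

[Default value]:
org.apache.ambari.server.agent.HeartBeatHandler.java
private Map hostResponseIds = new ConcurrentHashMap<>();
private Map hostResponses = new ConcurrentHashMap<>();
[Parameter description]:
Initial size of the heartbeat message caches
[Recommended value]:
2048, because tableSizeFor(1300) = 2048
[Remarks]:
Code change: specify the initial capacity.
The default capacity of ConcurrentHashMap is 16, which is too small for the Hubei site and causes the resize mechanism to be triggered repeatedly.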
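The value 2048 is the smallest power of two that is at least 1300, which is what a tableSizeFor-style rounding produces. The sketch below reproduces only that rounding in Python; it is not the JDK implementation and omits the JDK's upper bound on table size.

```python
def table_size_for(capacity: int) -> int:
    """Smallest power of two that is >= capacity (and at least 1)."""
    if capacity <= 1:
        return 1
    # (capacity - 1).bit_length() is the exponent of the next power of two.
    return 1 << (capacity - 1).bit_length()


if __name__ == "__main__":
    print(table_size_for(1300))
    print(table_size_for(16))
```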

21. server.execution.scheduler.wait

[Default value]: 1 second
org.apache.ambari.server.actionmanager.ActionScheduler.java
public ActionScheduler(@Named("schedulerSleeptime") long sleepTime,
    @Named("actionTimeout") long actionTimeout,
    ActionDBAccessor db, JPAEventPublisher jpaPublisher) {
  this.sleepTime = sleepTime;
  this.actionTimeout = actionTimeout;
  … …
}
org.apache.ambari.server.controller.ControllerModule.java
bindConstant().annotatedWith(Names.named("schedulerSleeptime")).to(
    configuration.getExecutionSchedulerWait());
bindConstant().annotatedWith(Names.named("actionTimeout")).to(600000L);
org.apache.ambari.server.configuration.Configuration.java
public static final ConfigurationProperty<Long> EXECUTION_SCHEDULER_WAIT = new ConfigurationProperty<>("server.execution.scheduler.wait", 1L);
[Parameter description]:
schedulerSleeptime is the execution period of ActionScheduler, that is, how long the thread waits when idle.
Ambari Parameter Tuning Summary - Figure 35
[Recommended value]:
Keep the default for now. If needed later, the code can be changed to support fractional (millisecond-level) values.
[Remarks]:
The unit of this setting is seconds; the code converts it to milliseconds internally.
1 <= sleepTime <= 60 (the maximum period is one minute; fractional values are not supported)
org.apache.ambari.server.configuration.Configuration.getExecutionSchedulerWait()
Ambari Parameter Tuning Summary - Figure 36
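The behaviour described in the remarks (whole seconds only, limited to the range 1 to 60, converted to milliseconds) can be sketched as follows. This is an illustration based on the notes, not a port of getExecutionSchedulerWait(); in particular, clamping out-of-range values and falling back to the default for unparsable input are assumptions.

```python
from typing import Optional

DEFAULT_WAIT_SECONDS = 1
MIN_WAIT_SECONDS = 1
MAX_WAIT_SECONDS = 60


def scheduler_wait_ms(raw_value: Optional[str]) -> int:
    """Convert the configured wait (whole seconds, as a string) to milliseconds."""
    try:
        seconds = int(raw_value) if raw_value is not None else DEFAULT_WAIT_SECONDS
    except ValueError:
        # Fractional or malformed values are not supported; fall back to the default.
        seconds = DEFAULT_WAIT_SECONDS
    seconds = max(MIN_WAIT_SECONDS, min(MAX_WAIT_SECONDS, seconds))
    return seconds * 1000


if __name__ == "__main__":
    for value in (None, "1", "120", "0.5"):
        print(value, scheduler_wait_ms(value))
```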

22. Hard-coded parameter actionTimeout

[Default value]: 600000L (10 minutes)
org.apache.ambari.server.actionmanager.ActionScheduler.java
public ActionScheduler(@Named("schedulerSleeptime") long sleepTime,
    @Named("actionTimeout") long actionTimeout,
    ActionDBAccessor db, JPAEventPublisher jpaPublisher) {
  this.sleepTime = sleepTime;
  this.actionTimeout = actionTimeout;
  … …
}
org.apache.ambari.server.controller.ControllerModule.java
bindConstant().annotatedWith(Names.named("actionTimeout")).to(600000L);
[Parameter description]:
List firstStageInProgressPerRequest = db.getFirstStageInProgressPerRequest();
publishInProgressTasks(firstStageInProgressPerRequest);
int i_stage = 0;
// get the range of requests in progress
long iLowestRequestIdInProgress = firstStageInProgressPerRequest.get(0).getRequestId();
long iHighestRequestIdInProgress = firstStageInProgressPerRequest.get(
firstStageInProgressPerRequest.size() - 1).getRequestId();

List hostsWithPendingTasks = hostRoleCommandDAO.getHostsWithPendingTasks(
iLowestRequestIdInProgress, iHighestRequestIdInProgress);
// filter the stages in progress down to those which can be scheduled in parallel
List stages = filterParallelPerHostStages(firstStageInProgressPerRequest);
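Ambari Parameter Tuning Summary - Figure 37

Ambari Parameter Tuning Summary - Figure 38

/etc/ambari-server/conf/ambari.properties
custom.action.definitions=/var/lib/ambari-server/resources/custom_action_definitions
Ambari Parameter Tuning Summary - Figure 39
Ambari Parameter Tuning Summary - Figure 40
[Recommended value]:
Keep the default for now. The value can later be moved into the configuration file.
[Remarks]:
How an event response timeout is detected: polling time >= time the event was last sent + actionTimeout
org.apache.ambari.server.actionmanager.ActionScheduler.java
boolean org.apache.ambari.server.actionmanager.ActionScheduler.timeOutActionNeeded()
Ambari Parameter Tuning Summary - Figure 41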

/**
 * This method processes command timeouts and retry attempts, and
 * adds new (pending) execution commands to commandsToSchedule list.
 * @return the stats for the roles in the stage which are used to determine
 * whether stage has succeeded or failed
 */
Map org.apache.ambari.server.actionmanager.ActionScheduler.processInProgressStage(Stage s, List commandsToSchedule, Multimap commandsToEnqueue)
if (timeOutActionNeeded(status, s, hostObj, roleStr, now, commandTimeout)) {
  db.timeoutHostRole(host, s.getRequestId(), s.getStageId(), c.getRole(), isSkipSupported,
      isHostStateUnknown);
}

void org.apache.ambari.server.actionmanager.ActionScheduler.doWork()
Ambari Parameter Tuning Summary - Figure 42
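The timeout rule stated in the remarks of this section (polling time >= last send time + actionTimeout) reduces to a single comparison. The sketch below shows only that comparison, with hypothetical names; the real timeOutActionNeeded() takes further arguments and presumably checks more than this.

```python
ACTION_TIMEOUT_MS = 600_000  # the hard-coded 10 minutes


def has_timed_out(now_ms: int, last_sent_ms: int, timeout_ms: int = ACTION_TIMEOUT_MS) -> bool:
    """True once the polling time reaches the last send time plus the timeout."""
    return now_ms >= last_sent_ms + timeout_ms


if __name__ == "__main__":
    sent_at = 1_000_000
    print(has_timed_out(sent_at + 599_999, sent_at))
    print(has_timed_out(sent_at + 600_000, sent_at))
```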

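23. Heartbeat monitor thread parameter threadWakeupInterval

[Default value]: 1 minute
Ambari Parameter Tuning Summary - Figure 43
Ambari Parameter Tuning Summary - Figure 44
Ambari Parameter Tuning Summary - Figure 45
Ambari Parameter Tuning Summary - Figure 46
[Parameter description]:
Execution period of the heartbeat monitor thread, and the heartbeat interval
[Recommended value]:
Keep the default for now.
[Remarks]:
This value should preferably match the agent's heartbeat reporting interval.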

24. √ Quartz job scheduling framework parameters

Ambari Parameter Tuning Summary - Figure 47
[Default value]:
Ambari Parameter Tuning Summary - Figure 48
Properties org.apache.ambari.server.scheduler.ExecutionSchedulerImpl.getQuartzSchedulerProperties()
properties.setProperty("org.quartz.threadPool.threadCount",
    configuration.getExecutionSchedulerThreads());
properties.setProperty("org.quartz.jobStore.isClustered",
    configuration.isExecutionSchedulerClusterd());
properties.setProperty("org.quartz.dataSource.myDS.maxConnections",
    configuration.getExecutionSchedulerConnections());
properties.setProperty("org.quartz.dataSource.myDS.maxCachedStatementsPerConnection",
    configuration.getExecutionSchedulerMaxStatementsPerConnection());
Ambari Parameter Tuning Summary - Figure 49
org.apache.ambari.server.configuration.Configuration.java
/**
 * Determines whether the Quartz rolling restart jobstore is clustered.
 */
@Markdown(description = "Determines whether Quartz will use a clustered job scheduled when performing scheduled actions like rolling restarts.")
public static final ConfigurationProperty<String> EXECUTION_SCHEDULER_CLUSTERED = new ConfigurationProperty<>(
    "server.execution.scheduler.isClustered", "false");

/**
 * The number of threads that the Quartz job scheduler will use.
 */
@Markdown(description = "The number of threads that the Quartz job scheduler will use when executing scheduled jobs.")
public static final ConfigurationProperty<String> EXECUTION_SCHEDULER_THREADS = new ConfigurationProperty<>(
    "server.execution.scheduler.maxThreads", "5");

/**
 * The number of concurrent database connections that the Quartz job scheduler can use.
 */
@Markdown(description = "The number of concurrent database connections that the Quartz job scheduler can use.")
public static final ConfigurationProperty<String> EXECUTION_SCHEDULER_CONNECTIONS = new ConfigurationProperty<>(
    "server.execution.scheduler.maxDbConnections", "5");

/**
 * The maximum number of prepared statements cached per database connection.
 */
@Markdown(description = "The maximum number of prepared statements cached per database connection.")
public static final ConfigurationProperty<String> EXECUTION_SCHEDULER_MAX_STATEMENTS_PER_CONNECTION = new ConfigurationProperty<>(
    "server.execution.scheduler.maxStatementsPerConnection", "120");

/**
 * The delay, in {@link TimeUnit#SECONDS}, that a Quartz job must wait before it starts.
 */
@Markdown(description = "The delay, in seconds, that a Quartz job must wait before it starts.")
public static final ConfigurationProperty<Integer> EXECUTION_SCHEDULER_START_DELAY = new ConfigurationProperty<>(
    "server.execution.scheduler.start.delay.seconds", 120);

public String isExecutionSchedulerClusterd() {
  return getProperty(EXECUTION_SCHEDULER_CLUSTERED);
}
public String getExecutionSchedulerThreads() {
  return getProperty(EXECUTION_SCHEDULER_THREADS);
}
public String getExecutionSchedulerConnections() {
  return getProperty(EXECUTION_SCHEDULER_CONNECTIONS);
}
public String getExecutionSchedulerMaxStatementsPerConnection() {
  return getProperty(EXECUTION_SCHEDULER_MAX_STATEMENTS_PER_CONNECTION);
}
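[Parameter description]:
Ambari Parameter Tuning Summary - Figure 50
/**
 * This class handles scheduling request execution for managed clusters
 */
org.apache.ambari.server.scheduler.ExecutionScheduleManager.java
Ambari Parameter Tuning Summary - Figure 51
Ambari Parameter Tuning Summary - Figure 52
[Recommended value]:
server.execution.scheduler.maxThreads=50
server.execution.scheduler.maxDbConnections=50
server.execution.scheduler.maxStatementsPerConnection=200
[Remarks]:
server.execution.scheduler.isClustered=false (changing this parameter is not recommended)
Clustering is enabled by setting the "org.quartz.jobStore.isClustered" property to "true". Every instance in the cluster must have a unique instance id (the "org.quartz.scheduler.instanceId" property) but the same scheduler instance name ("org.quartz.scheduler.instanceName"). In other words, every instance in the cluster must use the same quartz.properties file. The contents of the file must be identical except for the following allowed differences:
a different thread pool size, and a different "org.quartz.scheduler.instanceId" value (easily achieved by setting it to "AUTO").
Note: never run clustering on separate machines unless their clocks are synchronized by some form of time-sync service (daemon) that runs very regularly; the clocks must be within a second of each other.
Also: never start a non-clustered instance against a set of database tables that another instance is already running against. Doing so can seriously corrupt your data and cause unexpected behavior.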

25. Kerberos configuration parameters

/**
 * The time, in {@link TimeUnit#MINUTES}, that the temporary, in-memory
 * credential store retains values.
 */
@Markdown(description = "The time, in minutes, that the temporary, in-memory credential store retains values.")
public static final ConfigurationProperty<Long> TEMPORARYSTORE_RETENTION_MINUTES = new ConfigurationProperty<>(
    "security.temporary.keystore.retention.minutes", 90L);

Ambari Parameter Tuning Summary - Figure 53
Ambari Parameter Tuning Summary - Figure 54
Ambari Parameter Tuning Summary - Figure 55
Ambari Parameter Tuning Summary - Figure 56
Ambari Parameter Tuning Summary - Figure 57
Ambari Parameter Tuning Summary - Figure 58

/**
 * Determines whether the temporary keystore should have keys actively purged
 * on a fixed internal, or only when requested after expiration.
 */
@Markdown(description = "Determines whether the temporary keystore should have keys actively purged on a fixed internal. or only when requested after expiration.")
public static final ConfigurationProperty<Boolean> TEMPORARYSTORE_ACTIVELY_PURGE = new ConfigurationProperty<>(
    "security.temporary.keystore.actibely.purge", Boolean.TRUE);

Ambari Parameter Tuning Summary - Figure 59

26. Configuration for access without login authentication

The username of the default user assumed to be executing API calls. When set, authentication is not required in order to login to Ambari or use the REST APIs.
Ambari Parameter Tuning Summary - Figure 60
/etc/ambari-server/conf/ambari.properties
api.authenticated.user=admin
With this parameter set, the browser goes straight to the main page without a login.

ambari-release-AMBARI-2.7.3.0-139\ambari-server\src\main\java\org\apache\ambari\server\security\authorization\AmbariAuthorizationFilter.java
/**
 * Creates the default Authentication if a default user is configured
 * @return an Authentication representing the default user
 */
private Authentication getDefaultAuthentication()

ambari-release-AMBARI-2.7.3.0-139\ambari-server\src\main\java\org\apache\ambari\server\configuration\Configuration.java
/**
 * The username of the default user assumed to be executing API calls. When
 * set, authentication is not required in order to login to Ambari or use the
 * REST APIs.
 */
@Markdown(description = "The username of the default user assumed to be executing API calls. When set, authentication is not required in order to login to Ambari or use the REST APIs. ")
public static final ConfigurationProperty<String> API_AUTHENTICATED_USER = new ConfigurationProperty<>("api.authenticated.user", null);

ambari-release-AMBARI-2.7.3.0-139\ambari-server\src\main\java\org\apache\ambari\server\api\services\users\UserAuthorizationService.java

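27. Change the Log4j log level to DEBUG

[Default value]: INFO
/etc/ambari-server/conf/log4j.properties
log4j.rootLogger=INFO,file
[Parameter description]:
Log4j log level
[Recommended value]:
log4j.rootLogger=DEBUG,file
[Remarks]:
INFO-level logs contain too little information to analyze and locate problems during an upgrade, so DEBUG is recommended.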

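Ambari Agent Configuration Parameter Tuning

Ambari Agent configuration file: /etc/ambari-agent/conf/ambari-agent.ini

1. HeartbeatThread heartbeat reporting interval

[Default value]:
Ambari Parameter Tuning Summary - Figure 61
Ambari Parameter Tuning Summary - Figure 62
[Parameter description]:
Interval at which Ambari Agent reports heartbeat messages, in seconds
[Recommended value]:
Keep the default for now.
[Remarks]:
None

2. HeartbeatThread heartbeat response timeout

[Default value]:
Ambari Parameter Tuning Summary - Figure 63
Ambari Parameter Tuning Summary - Figure 64
[Parameter description]:
Timeout for Ambari Agent heartbeat responses, in seconds
[Recommended value]:
Keep the default for now.
[Remarks]:
None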

3. Execution period of the Ambari Agent status reporting threads

[Default value]:
ambari_agent\AmbariConfig.py
# Command execution report interval
@property
def command_reports_interval(self):
  return int(self.get('agent', 'command_reports_interval', default='5'))

# Alert report interval
@property
def alert_reports_interval(self):
  return int(self.get('agent', 'alert_reports_interval', default='5'))

# Component status report interval
@property
def status_commands_run_interval(self):
  return int(self.get('agent', 'status_commands_run_interval', default='20'))

# Host status report interval
@property
def host_status_report_interval(self):
  return int(self.get('heartbeat', 'state_interval_seconds', '60'))
[Parameter description]:
Execution period of the Ambari Agent status reporting threads, in seconds
[Recommended value]:
Keep the default for now.
[Remarks]:
None
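To inspect which reporting intervals an agent would effectively use, the same sections and keys can be read with Python's standard configparser, falling back to the defaults shown above when a key is absent. This is a standalone sketch, not the agent's own AmbariConfig class; the sample ini text and the function name are made up for illustration.

```python
import configparser
from typing import Dict

# Hypothetical excerpt of an ambari-agent.ini file.
SAMPLE_INI = """
[agent]
command_reports_interval = 5
status_commands_run_interval = 20

[heartbeat]
state_interval_seconds = 60
"""


def report_intervals(ini_text: str) -> Dict[str, int]:
    """Read the reporting intervals (seconds), using the documented defaults as fallbacks."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        "command_reports_interval": parser.getint("agent", "command_reports_interval", fallback=5),
        "alert_reports_interval": parser.getint("agent", "alert_reports_interval", fallback=5),
        "status_commands_run_interval": parser.getint("agent", "status_commands_run_interval", fallback=20),
        "host_status_report_interval": parser.getint("heartbeat", "state_interval_seconds", fallback=60),
    }


if __name__ == "__main__":
    print(report_intervals(SAMPLE_INI))
```

To check a real host, pass the contents of /etc/ambari-agent/conf/ambari-agent.ini instead of the sample text.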

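4. parallel_execution

[Default value]:
Ambari Parameter Tuning Summary - Figure 65
ambari_agent\AmbariConfig.py
def get_parallel_exec_option(self):
  return int(self.get('agent', 'parallel_execution', 0))
Ambari Parameter Tuning Summary - Figure 66
Ambari Parameter Tuning Summary - Figure 67
[Parameter description]:
Whether to enable multi-threaded parallel processing for commands that support retry
[Recommended value]:
Keep the default for now.
[Remarks]:
So far the load is mainly on the server side. If responses to command execution events sent by the server time out, try adjusting this parameter.

5. TODO: change the Python log level to DEBUG

[Default value]: INFO
[Parameter description]:
[Recommended value]:
[Remarks]:
INFO-level logs contain too little information to analyze and locate problems during an upgrade, so DEBUG is recommended.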