TEZ UI issue
tez.history.logging.service.class
Change org.apache.tez.dag.history.logging.ats.ATSV15HistoryLoggingService to
org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService (otherwise some items fail to display on the Tez UI pages). See the tez-site.xml snippet below.
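As a tez-site.xml snippet (the property name and value are the ones given above; tez-site.xml is where this setting normally lives):
<property>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>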
Parameter tuning
hive.execution.engine=tez;
Switches the execution engine to Tez.
hive.auto.convert.join=true;
Enables map join.
hive.auto.convert.join.noconditionaltask=true;
Merges multiple map joins over small datasets into a single one.
hive.mapjoin.smalltable.filesize=25000000;
Trigger condition for the above: file-size threshold (in bytes) below which a table is considered small enough for map join.
hive.auto.convert.join.noconditionaltask.size=60000000;
Set to about 30% of hive.tez.container.size; also bounded by the Hadoop maximum Java heap size and affected by whether the files are compressed (for ORC files, divide by 10; lower this value if map join runs out of memory). The map-join settings are collected into one sketch below.
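A minimal Hive session sketch of the map-join settings above, using the values from this section:

SET hive.auto.convert.join=true;                             -- enable map join
SET hive.auto.convert.join.noconditionaltask=true;           -- merge multiple small map joins
SET hive.mapjoin.smalltable.filesize=25000000;               -- ~25 MB small-table trigger
SET hive.auto.convert.join.noconditionaltask.size=60000000;  -- ~60 MB; keep near 30% of hive.tez.container.size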
hive.tez.container.size=4096
tez.am.resource.memory.mb=8192
Must not be set too small, or the AM container gets stuck in continuous full GC and eventually aborts with an error.
tez.runtime.io.sort.mb=512
40% of hive.tez.container.size
tez.runtime.unordered.output.buffer.size-mb=400
10% of hive.tez.container.size
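The memory settings above as one Hive session sketch; the values and percentage guidelines are the ones given in this section:

SET hive.tez.container.size=4096;                     -- MB per Tez task container
SET tez.am.resource.memory.mb=8192;                   -- AM memory; too small causes continuous full GC
SET tez.runtime.io.sort.mb=512;                       -- guideline: ~40% of hive.tez.container.size
SET tez.runtime.unordered.output.buffer.size-mb=400;  -- guideline: ~10% of hive.tez.container.size

As a sanity check, with a 4096 MB container the 40%/10% guidelines would allow up to roughly 1638 MB and 410 MB respectively, so the 512/400 values above stay within those caps.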
tez.grouping.min-size
Lower bound on the input split size; controls the number of map tasks.
tez.grouping.max-size
Upper bound on the input split size; controls the number of map tasks. See the illustrative settings below.
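Illustrative only (the byte values below are hypothetical, not from this section): smaller bounds produce more, smaller map tasks; larger bounds produce fewer, larger ones:

SET tez.grouping.min-size=16777216;    -- 16 MB lower bound on split size
SET tez.grouping.max-size=1073741824;  -- 1 GB upper bound on split size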
tez.session.am.dag.submit.timeout.secs=10
Timeout for the AM to wait for a DAG submission. In dacp, the tasks of a single workflow use different sessions, so this value should be kept small to avoid wasting resources.
hive.tez.auto.reducer.parallelism=true;
Enables Tez auto reducer parallelism: Tez samples vertex output sizes at runtime and adjusts the reducer count as needed.
hive.exec.reducers.bytes.per.reducer
256000000; can be lowered as appropriate (a smaller value yields more reducers).
hive.tez.min.partition.factor=0.05;
Lets auto reducer parallelism scale the reducer count down to as little as this fraction of the estimate, which is itself capped by hive.exec.reducers.max (1099); can be lowered as appropriate.
hive.tez.max.partition.factor=2.0;
Hive/Tez estimates the number of reducers as:
Max(1, Min(hive.exec.reducers.max [1099], ReducerStage estimate / hive.exec.reducers.bytes.per.reducer)) × hive.tez.max.partition.factor [2]; can be raised as appropriate.
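A worked instance of the formula above, assuming a hypothetical reducer-stage estimate of 100 GB (the 100 GB figure is an assumption; the other values are from this section):

ReducerStage estimate / hive.exec.reducers.bytes.per.reducer = 100,000,000,000 / 256,000,000 ≈ 391
Max(1, Min(1099, 391)) = 391
391 × hive.tez.max.partition.factor (2.0) = 782 reducers launched up front;
auto reducer parallelism can then trim this at runtime, down to the hive.tez.min.partition.factor floor.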
hive.prewarm.enabled
hive.prewarm.numcontainers
Number of prewarmed containers launched per AM at startup; e.g. with 3, opening a session launches 4 containers (the AM plus 3 prewarmed). See the sketch below.
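As a session sketch (values from above):

SET hive.prewarm.enabled=true;
SET hive.prewarm.numcontainers=3;  -- opening a session then launches 4 containers (AM + 3 prewarmed)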
tez.shuffle-vertex-manager.min-src-fraction=0.25;
tez.shuffle-vertex-manager.max-src-fraction=0.75;
This means the decision is made somewhere between 25% and 75% of the mappers finishing, provided at least 1 GB of data has been output (i.e. if the first 25% of mappers have not yet emitted 1 GB, Tez waits until at least 1 GB has been sent).
tez.am.container.idle.release-timeout-min.millis
The minimum amount of time to hold on to an idle container before it becomes eligible for release.
tez.am.container.idle.release-timeout-max.millis
Int value. The maximum amount of time to hold on to a container if no task can be assigned to it immediately.
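A session sketch with hypothetical millisecond values (not from this section); these normally belong in tez-site.xml or must be set before the Tez session is created:

SET tez.am.container.idle.release-timeout-min.millis=10000;  -- hold an idle container at least 10 s
SET tez.am.container.idle.release-timeout-max.millis=20000;  -- release after at most 20 s if no task fits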