Problem Background
- The spark thrift server's local log file grows too large
- The YARN ApplicationMaster redirects to the Spark executors
- The JDBC thrift server is started via ${SPARK_HOME}/sbin/start-thriftserver.sh --master yarn-client
Analysis
- The local log is presumably produced by redirecting the yarn-client job's output to a file
- vim ${SPARK_HOME}/sbin/start-thriftserver.sh
- Nothing relevant there; it goes on to call ${SPARK_HOME}/sbin/spark-daemon.sh
- vim ${SPARK_HOME}/sbin/spark-daemon.sh

```bash
# line 128
execute_command() {
  if [ -z ${SPARK_NO_DAEMONIZE+set} ]; then
      nohup -- "$@" >> $log 2>&1 < /dev/null &
      newpid="$!"

      echo "$newpid" > "$pid"

      # Poll for up to 5 seconds for the java process to start
      for i in {1..10}
      do
        if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
          break
        fi
        sleep 0.5
      done

      sleep 2
      # Check if the process has died; in that case we'll tail the log so the user can see
      if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
        echo "failed to launch: $@"
        tail -10 "$log" | sed 's/^/  /'
        echo "full log in $log"
      fi
  else
      "$@"
  fi
}
```
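The `nohup -- "$@" >> $log 2>&1` line above is the redirection that produces the large local file. Earlier in the same script, `$log` is derived from `SPARK_LOG_DIR`, roughly as sketched below (from memory; exact variable names may differ between Spark versions):

```bash
# sketch of how spark-daemon.sh builds the local log path (may vary by version)
if [ "$SPARK_LOG_DIR" = "" ]; then
  export SPARK_LOG_DIR="${SPARK_HOME}/logs"
fi
log="$SPARK_LOG_DIR/spark-$SPARK_IDENT_STRING-$command-$instance-$HOSTNAME.out"
```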
Solution
- Write the yarn-client job's output to files, split by log category
As for the locally redirected log, it can simply be sent to /dev/null, e.g. as sketched below.
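A minimal sketch (an assumption about how the script could be patched, not part of the stock script): change the redirection in execute_command() shown above so the daemonized output is discarded instead of appended to $log. Note that the failure branch further down tails $log, so startup errors would no longer be visible there.

```bash
# sketch: in ${SPARK_HOME}/sbin/spark-daemon.sh, replace
#   nohup -- "$@" >> $log 2>&1 < /dev/null &
# with
nohup -- "$@" > /dev/null 2>&1 < /dev/null &
```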
To split the output by category, start the thrift server with a dedicated log4j.properties shipped to the containers via --files:

```bash
$SPARK_HOME/sbin/start-thriftserver.sh \
  --name "XXX Thrift Server" \
  --master yarn-client \
  --queue xxx \
  --num-executors 2 \
  --conf spark.driver.memory=10g \
  --executor-memory 6g \
  --conf spark.executor.memoryOverhead=2048 \
  --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
  --hiveconf hive.default.fileformat=parquet \
  --files "$SPARK_HOME/xxx_conf/log4j.properties" \
  --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties \
  --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties
```
In log4j.properties, change the configuration so that logs are written to files, and drop the original console output. To keep different jobs on the same executor from writing to the same log file at the same time, the log file path should be placed under spark.yarn.app.container.log.dir (see the sketch after the config below).
Because different tasks use different containers, the log files are then created dynamically under the current task's container directory, and the output ends up in the same place as the original stdout and stderr.
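Once the logs land in the container directories, they can be collected with the standard YARN CLI; the application id below is a placeholder:

```bash
# pull all container logs for the thrift server application (placeholder id)
yarn logs -applicationId application_1234567890123_0001 > thriftserver_logs.txt
```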
- $SPARK_HOME/conf/log4j.properties

```properties
log4j.rootLogger=INFO,stdout,I,E

# output to console
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%-d{yyyy-MM-dd HH:mm} %5p %t %c{2}:%L - %m%n

# output error to files
log4j.appender.E=org.apache.log4j.DailyRollingFileAppender
log4j.appender.E.layout=org.apache.log4j.PatternLayout
log4j.appender.E.layout.conversionPattern=%-d{yyyy-MM-dd HH:mm:ss} %5p %t %c{2}:%L - %m%n
log4j.appender.E.maxFileSize=100MB
log4j.appender.E.maxBackupIndex=5
log4j.appender.E.Append=true
log4j.appender.E.Threshold=ERROR
log4j.appender.E.file=/home/root/log/streaming/stderror.log
log4j.appender.E.encoding=UTF-8

# output info to files
log4j.appender.I=org.apache.log4j.DailyRollingFileAppender
log4j.appender.I.layout=org.apache.log4j.PatternLayout
log4j.appender.I.layout.conversionPattern=%-d{yyyy-MM-dd HH:mm:ss} %5p %t %c{2}:%L - %m%n
log4j.appender.I.maxFileSize=100MB
log4j.appender.I.maxBackupIndex=5
log4j.appender.I.Append=true
log4j.appender.I.Threshold=INFO
log4j.appender.I.file=/home/root/log/streaming/stdout.log
log4j.appender.I.encoding=UTF-8
```
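To match the spark.yarn.app.container.log.dir approach described above, the file appenders can reference the container log directory instead of a fixed path. A sketch, assuming the executors run in YARN containers where Spark sets the spark.yarn.app.container.log.dir system property (the local yarn-client driver does not have it, so a separate driver config with a fixed path still makes sense):

```properties
# sketch: write per-container log files into the YARN container log directory
log4j.appender.I.file=${spark.yarn.app.container.log.dir}/stdout.log
log4j.appender.E.file=${spark.yarn.app.container.log.dir}/stderror.log
```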
- ${SPARK_HOME}/conf/spark-defaults.conf

```bash
spark.eventLog.enabled true
spark.eventLog.dir hdfs://SERVICE-HADOOP-admin-1//var/log/spark_hislog
spark.history.fs.logDirectory hdfs://SERVICE-HADOOP-admin-1//var/log/spark_hislog
spark.history.fs.update.interval 20s
spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.maxAge 30d
spark.history.fs.cleaner.interval 1d
spark.sql.warehouse.dir hdfs://SERVICE-HADOOP-admin-1//user/hive/warehouse
spark.driver.memory 4g
spark.executor.memory 4g
spark.driver.extraJavaOptions -XX:MaxPermSize=1024m -XX:PermSize=256m
spark.port.maxRetries 100
```
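The event log / history directory referenced above typically has to exist on HDFS before event logging and the history server can use it; creating it is a one-off step (path copied from the config above, permissions are an assumption to adjust for your environment):

```bash
# one-off setup for the Spark event log / history directory
hdfs dfs -mkdir -p hdfs://SERVICE-HADOOP-admin-1//var/log/spark_hislog
hdfs dfs -chmod -R 777 hdfs://SERVICE-HADOOP-admin-1//var/log/spark_hislog
```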
