Hadoop YARN Mode
I. Submitting Jobs
1. Submitting a streaming job

    spark-submit \
      --name SafeRealtimeClickStreaming \
      --queue root.realtime \
      --class com.dw2345.dw_realtime.module.safe.streaming.SafeRealtimeClickStreaming \
      --master yarn \
      --deploy-mode client \
      --driver-cores 2 \
      --driver-memory 4024M \
      --executor-memory 4024M \
      --num-executors 2 \
      --conf "spark.executorEnv.JAVA_HOME=/usr/local/jdk1.8" \
      --conf "spark.yarn.appMasterEnv.JAVA_HOME=/usr/local/jdk1.8" \
      --conf "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC" \
      --conf "spark.streaming.backpressure.enabled=true" \
      --conf "spark.streaming.kafka.maxRatePerPartition=10000" \
      --conf "spark.streaming.blockInterval=1000" \
      --conf "spark.driver.extraClassPath=${SBT_HOME}/ivy-repository/cache/mysql/mysql-connector-java/jars/mysql-connector-java-5.1.30.jar" \
      ~/app/dw_realtime/target/scala-2.11/dw_realtime.jar develop SafeRealtimeClickLog 10 /data/log/real_time/offset/SafeRealtimeClick

PS: Garbage collection and memory usage

Enabling Java's Concurrent Mark-Sweep (CMS) collector reduces the unpredictable long pauses caused by GC. CMS consumes more resources overall, but pauses occur less often. To also write GC details and timestamps to the executor logs, add the print flags:

    --conf "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

2. Dynamic deployment

    spark-submit \
      --name SafeRealtimeClickStreaming \
      --queue root.realtime \
      --class com.dw2345.dw_realtime.module.safe.streaming.SafeRealtimeClickStreaming \
      --master yarn \
      --deploy-mode client \
      --driver-cores 2 \
      --driver-memory 4024M \
      --executor-memory 4024M \