1. Pull the chart
1. Create a directory to hold the resources
mkdir -p ~/k8s
mkdir ~/k8s/spark-helm
cd ~/k8s/spark-helm
2. Pull the chart from the official Helm repository
helm search repo spark
helm install incubator/sparkoperator --namespace spark-operator
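Note that `helm search repo` is Helm 3 syntax, while `helm install` without a release name only works with Helm 2, so one of the two commands needs adjusting depending on the Helm version. A minimal Helm 3 sketch follows; the archived incubator repository URL, the release name sparkoperator, and the pre-created namespace are assumptions, not part of the original instructions (newer versions of the chart have since moved to the spark-on-k8s-operator project's own repository):

# Assumption: the incubator repo has not been added yet; the archived
# incubator charts are served from charts.helm.sh.
helm repo add incubator https://charts.helm.sh/incubator
helm repo update

# Make sure the target namespace exists before installing into it.
kubectl create namespace spark-operator

# Helm 3 requires a release name ("sparkoperator" here is illustrative).
helm install sparkoperator incubator/sparkoperator --namespace spark-operator

# Verify that the operator pod is running.
kubectl get pods -n spark-operator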
2. Adjust the configuration
CRD structure
The CRDs used by the Spark Operator have the following structure:
ScheduledSparkApplication
|__ ScheduledSparkApplicationSpec
    |__ SparkApplication
|__ ScheduledSparkApplicationStatus

SparkApplication
|__ SparkApplicationSpec
    |__ DriverSpec
        |__ SparkPodSpec
    |__ ExecutorSpec
        |__ SparkPodSpec
    |__ Dependencies
    |__ MonitoringSpec
        |__ PrometheusSpec
|__ SparkApplicationStatus
    |__ DriverInfo
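After the chart is installed you can confirm that these CRDs were registered with the cluster. A quick check, assuming the operator registers them under the sparkoperator.k8s.io API group (the exact CRD names may differ between chart versions):

# List the CRDs created by the operator; the expected entries are roughly
# sparkapplications.sparkoperator.k8s.io and
# scheduledsparkapplications.sparkoperator.k8s.io.
kubectl get crd | grep sparkoperator.k8s.io

# Inspect one of them in detail.
kubectl describe crd sparkapplications.sparkoperator.k8s.io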
Job orchestration
To submit a job, define a SparkApplication YAML like the one below; for the meaning of each field, refer to the CRD documentation.
apiVersion: sparkoperator.k8s.io/v1beta1
kind: SparkApplication
metadata:
  ...
spec:
  deps: {}
  driver:
    coreLimit: 200m
    cores: 0.1
    labels:
      version: 2.3.0
    memory: 512m
    serviceAccount: spark
  executor:
    cores: 1
    instances: 1
    labels:
      version: 2.3.0
    memory: 512m
  image: gcr.io/ynli-k8s/spark:v2.4.0
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
  mainClass: org.apache.spark.examples.SparkPi
  mode: cluster
  restartPolicy:
    type: OnFailure
    onFailureRetries: 3
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 5
    onSubmissionFailureRetryInterval: 20
  type: Scala
status:
  sparkApplicationId: spark-5f4ba921c85ff3f1cb04bef324f9154c9
  applicationState:
    state: COMPLETED
  completionTime: 2018-02-20T23:33:55Z
  driverInfo:
    podName: spark-pi-83ba921c85ff3f1cb04bef324f9154c9-driver
    webUIAddress: 35.192.234.248:31064
    webUIPort: 31064
    webUIServiceName: spark-pi-2402118027-ui-svc
    webUIIngressName: spark-pi-ui-ingress
    webUIIngressAddress: spark-pi.ingress.cluster.com
  executorState:
    spark-pi-83ba921c85ff3f1cb04bef324f9154c9-exec-1: COMPLETED
  LastSubmissionAttemptTime: 2018-02-20T23:32:27Z
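The spec above references serviceAccount: spark for the driver. If the Helm chart was not configured to create it, that service account and a role binding have to exist before the job is submitted. A hedged sketch, assuming the default namespace and the built-in edit ClusterRole (the chart's own RBAC values may already cover this):

# Create the service account referenced by spec.driver.serviceAccount.
kubectl create serviceaccount spark

# Allow it to create and manage executor pods in its namespace.
kubectl create clusterrolebinding spark-role --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default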
3. Submit the job
kubectl apply -f spark-pi.yaml
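After the apply, the operator handles the actual spark-submit, and progress can be followed through the SparkApplication object itself. A sketch, assuming the application is named spark-pi in the default namespace (it should match metadata.name in spark-pi.yaml):

# Watch the application state (e.g. SUBMITTED -> RUNNING -> COMPLETED).
kubectl get sparkapplications

# Detailed status and events for this run.
kubectl describe sparkapplication spark-pi

# The driver pod name is reported in status.driverInfo.podName; use it to
# follow the driver logs.
kubectl logs -f $(kubectl get sparkapplication spark-pi -o jsonpath='{.status.driverInfo.podName}')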
References
Zhihu: Spark on Kubernetes的现状与挑战 (The current state and challenges of Spark on Kubernetes)
https://zhuanlan.zhihu.com/p/76318638
