Spark on Kubernetes Deployment Guide

2025/07/08 · Spark · Kubernetes · ~4,122 words · ~12 min read

This guide demonstrates how to install and run Spark on Kubernetes, covering:

  1. Install the Spark Operator
  2. Configure a ServiceAccount and permissions
  3. Create persistent storage volumes
  4. Write and deploy a Spark Application

1️⃣ Install the Spark Operator

Helm is the recommended installation method:

# Add the Helm repository
helm repo add spark-operator https://kubeflow.github.io/spark-operator
helm repo update

# Install the Spark Operator
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator \
  --create-namespace \
  --set spark.jobNamespaces="{spark-operator}" \
  --set controller.nodeSelector."kubernetes\.io/hostname"="xxt" \
  --set webhook.nodeSelector."kubernetes\.io/hostname"="xxt"

📦 If network access is unreliable, you can download the chart package and install it locally:

wget https://github.com/kubeflow/spark-operator/releases/download/spark-operator-chart-1.4.6/spark-operator-1.4.6.tgz

helm install spark-operator ./spark-operator-1.4.6.tgz \
  --namespace spark-operator \
  --create-namespace
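
Whichever install path you use, helm list offers a quick sanity check on the release:

# STATUS should read "deployed"
helm list -n spark-operator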

Verify the installation:

kubectl get pods -n spark-operator

Sample output:

NAME                              READY   STATUS    RESTARTS   AGE
spark-operator-7dcb44748b-cbjfr   1/1     Running   0          2d18h

Confirm which namespaces the operator watches:

kubectl get deployment spark-operator-controller -n spark-operator -o yaml | grep namespaces

2️⃣ Configure the ServiceAccount and Permissions

The Spark driver and executors need access to the Kubernetes API; in particular, the driver creates and deletes executor pods at runtime.

2.1 Create a ServiceAccount

Create spark-account.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark-account
  namespace: spark-operator

2.2 Create a ClusterRoleBinding

Create spark-role-binding.yaml:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spark-cluster-role-binding
subjects:
  - kind: ServiceAccount
    name: spark-account
    namespace: spark-operator
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

⚠️ Note: for security, apply the principle of least privilege in production and avoid binding cluster-admin directly.
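
As a least-privilege alternative, a namespaced Role plus RoleBinding along the following lines is usually sufficient for the driver. This is a sketch only; the resource list and verbs are assumptions that may need tuning for your Spark and operator versions:

# spark-driver-role.yaml (hypothetical least-privilege replacement for cluster-admin)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-driver-role
  namespace: spark-operator
rules:
  # The driver creates executor pods plus supporting services and configmaps
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "persistentvolumeclaims"]
    verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-driver-role-binding
  namespace: spark-operator
subjects:
  - kind: ServiceAccount
    name: spark-account
    namespace: spark-operator
roleRef:
  kind: Role
  name: spark-driver-role
  apiGroup: rbac.authorization.k8s.io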

2.3 Apply the Configuration

kubectl apply -f spark-account.yaml
kubectl apply -f spark-role-binding.yaml

Verify:

kubectl get serviceaccount spark-account -n spark-operator
kubectl get clusterrolebinding spark-cluster-role-binding

✅ Once applied, spark-account has the required permissions.
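
kubectl can also confirm the grant directly via impersonation; this check should print yes:

kubectl auth can-i create pods \
  --as=system:serviceaccount:spark-operator:spark-account \
  -n spark-operator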


3️⃣ Create Storage Volumes

To mount shared data into jobs, configure a PV and a PVC.

3.1 Create a PersistentVolume (PV)

Create pv.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: spark-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /Users/xjw/software/spark-operator/data/spark
    type: DirectoryOrCreate
  storageClassName: local-storage

📢 Note: hostPath is only suitable for single-node test environments.
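
On a multi-node cluster, use a network-backed volume instead so that every node sees the same data. A sketch using NFS, assuming a hypothetical server at 192.168.1.100 exporting /exports/spark (keep storageClassName consistent with the PVC below):

# pv-nfs.yaml (hypothetical multi-node alternative to hostPath)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: spark-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.1.100   # hypothetical NFS server
    path: /exports/spark    # hypothetical exported directory
  storageClassName: local-storage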

3.2 Create a PersistentVolumeClaim (PVC)

Create pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-pvc
  namespace: spark-operator
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: local-storage

3.3 Deploy the PV and PVC

kubectl apply -f pv.yaml
kubectl apply -f pvc.yaml

Verify the binding status:

kubectl get pv
kubectl get pvc -n spark-operator
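
The PVC should report STATUS Bound, with output along these lines (names and ages will vary):

NAME        STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
spark-pvc   Bound    spark-pv   2Gi        RWX            local-storage   1m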

4️⃣ Deploy a Spark Application

The following example runs the Spark Pi job:

Create spark-custom.yaml:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi-custom-resource
  namespace: spark-operator
spec:
  type: Scala
  mode: cluster
  image: spark:3.5.5
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.5.5.jar"
  sparkVersion: 3.5.5
  restartPolicy:
    type: Never
  arguments:
    - "10000"
  sparkConf:
    "spark.driver.extraJavaOptions": "-Dlog4j.configurationFile=file:/mnt/data/log4j2.properties"
    "spark.executor.extraJavaOptions": "-Dlog4j.configurationFile=file:/mnt/data/log4j2.properties"
  volumes:
    - name: spark-storage
      persistentVolumeClaim:
        claimName: spark-pvc
    - name: host-time
      hostPath:
        path: /etc/localtime
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark-account
    nodeSelector:
      kubernetes.io/hostname: xjw
    volumeMounts:
      - name: spark-storage
        mountPath: /mnt/data
      - name: host-time
        mountPath: /etc/localtime
        readOnly: true
    env:
      - name: TZ
        value: Asia/Shanghai
  executor:
    cores: 1
    instances: 2
    memory: 512m
    nodeSelector:
      kubernetes.io/hostname: xjw
    volumeMounts:
      - name: spark-storage
        mountPath: /mnt/data
      - name: host-time
        mountPath: /etc/localtime
        readOnly: true
    env:
      - name: TZ
        value: Asia/Shanghai
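
Note that spark.driver.extraJavaOptions and spark.executor.extraJavaOptions above point Log4j at /mnt/data/log4j2.properties, i.e. a file on the mounted PVC, so it must be placed there before the job starts. A minimal version, modeled on the log4j2 template that ships with Spark (a sketch; adjust levels and pattern as needed):

# log4j2.properties: route all logs to the console
rootLogger.level = info
rootLogger.appenderRef.stdout.ref = console

appender.console.type = Console
appender.console.name = console
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n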

Deploy it:

kubectl apply -f spark-custom.yaml
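
To watch the application move through its lifecycle (the operator reports states such as SUBMITTED, RUNNING, and COMPLETED):

kubectl get sparkapplication spark-pi-custom-resource -n spark-operator -w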

5️⃣ Verify Execution and Mounts

Check the application status:

kubectl describe sparkapplication spark-pi-custom-resource -n spark-operator

Confirm the mount in the driver pod description:

Mounts:
  /mnt/data from spark-storage (rw)

View the driver logs:

kubectl logs spark-pi-custom-resource-driver -n spark-operator

If the run succeeded, the end of the log should contain:

Pi is roughly 3.1460957304786525
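
Spark labels the pods it creates with spark-role, which makes it easy to watch the driver and executors while the application is still running:

kubectl get pods -n spark-operator -l spark-role=driver
kubectl get pods -n spark-operator -l spark-role=executor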

6️⃣ Clean Up Resources

Delete the driver pod and the SparkApplication:

kubectl delete pod spark-pi-custom-resource-driver -n spark-operator
kubectl delete sparkapplication spark-pi-custom-resource -n spark-operator
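
To tear down everything else created in this guide, remove the storage and RBAC objects and uninstall the operator:

kubectl delete -f pvc.yaml -f pv.yaml
kubectl delete -f spark-role-binding.yaml -f spark-account.yaml
helm uninstall spark-operator -n spark-operator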

✨ Tips

Version compatibility: spec.sparkVersion must match the Spark version of the image.

Spark UI: while the driver is running, the UI can be reached via port forwarding:

kubectl port-forward svc/spark-pi-custom-resource-ui-svc 4040:4040 -n spark-operator

Multi-tenant security: in production, prefer a custom Role over cluster-admin.

