PySpark app in k8s cluster

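Building and packaging a PySpark application is similar to building a Spark application: build the image, push it to a registry, then submit the job to the cluster. The only difference in the build step is the -p flag, which points to the Python bindings Dockerfile.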

Build PySpark Image

docker-image-tool.sh -r repo.logpoint.com.np/py-spark -t v0.0.1 -p /opt/spark/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build

Pushing PySpark Image to registry

docker-image-tool.sh -r repo.logpoint.com.np/py-spark -t v0.0.1 -p /opt/spark/kubernetes/dockerfiles/spark/bindings/python/Dockerfile push
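The image reference produced by the build follows from the -r and -t values. As a minimal sketch, assuming the stock docker-image-tool.sh naming of `<repo>/spark-py:<tag>` for the Python bindings image, the name to use later for spark.kubernetes.container.image can be derived as below. The variable names are illustrative only.

```shell
# Values passed to docker-image-tool.sh via -r and -t
REPO="repo.logpoint.com.np/py-spark"
TAG="v0.0.1"

# Assumption: the tool tags the Python image as <repo>/spark-py:<tag>
PYSPARK_IMAGE="${REPO}/spark-py:${TAG}"

echo "${PYSPARK_IMAGE}"
```

If your Spark version names the image differently, `docker images` after the build shows the actual repository and tag.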

Running PySpark Image on Kubernetes

spark-submit \
    --master k8s://https://192.168.2.55:6443 \
    --deploy-mode cluster \
    --name spark-pi \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.container.image=registry.logpoint.com.np/sparkjob:0.0.4 \
    local:///opt/spark/examples/src/main/python/pi.py

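This runs the Spark Pi example in Python, using the pi.py file that ships inside the image. The local:// scheme tells Spark that the file is already present in the container. Additional code can be added to the container image and executed in the same way.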
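As a rough sketch of running your own script instead of the bundled example, the submit command can be generated with the application path and image as parameters. The function name print_submit_cmd, the job name my-pyspark-job, and the path /opt/spark/work-dir/my_job.py are hypothetical, and the sketch assumes the script has already been copied into the image. It only prints the command for review; nothing is submitted.

```shell
# Sketch: print a spark-submit command for a Python file baked into the image.
# print_submit_cmd, my-pyspark-job and my_job.py are hypothetical names.
print_submit_cmd() {
    app_path="$1"   # absolute path of the .py file inside the container
    image="$2"      # image that contains the file
    cat <<EOF
spark-submit \\
    --master k8s://https://192.168.2.55:6443 \\
    --deploy-mode cluster \\
    --name my-pyspark-job \\
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \\
    --conf spark.executor.instances=1 \\
    --conf spark.kubernetes.container.image=${image} \\
    local://${app_path}
EOF
}

SUBMIT_CMD=$(print_submit_cmd /opt/spark/work-dir/my_job.py \
    registry.logpoint.com.np/sparkjob:0.0.4)

# Review the generated command before running it by hand.
printf '%s\n' "$SUBMIT_CMD"
```

Because the path is absolute, `local://` plus the path yields a URI with three slashes, matching the form used for pi.py.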