PySpark app in k8s cluster
Building and packaging a PySpark application is similar to building a Spark app.
Build PySpark Image
docker-image-tool.sh -r repo.logpoint.com.np/py-spark -t v0.0.1 -p /opt/spark/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
Pushing PySpark Image to registry
docker-image-tool.sh -r repo.logpoint.com.np/py-spark -t v0.0.1 push
Running PySpark Image on Kubernetes
spark-submit \
--master k8s://https://192.168.2.55:6443 \
--deploy-mode cluster \
--name spark-pi \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=registry.logpoint.com.np/sparkjob:0.0.4 \
local:///opt/spark/examples/src/main/python/pi.py
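This runs the Spark Pi example in Python using the pi.py file. The local:// scheme refers to a path inside the container, so the image given in spark.kubernetes.container.image must already contain the script at that location. Additional code can be added to the container image and executed in the same way.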
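As an illustration of such additional code, the following is a minimal sketch of a custom job modelled on the bundled Pi example. The file name my_job.py, the app name custom-pi, and the helper names are hypothetical and not part of Spark or of the setup above. The pyspark import is placed inside main() only so that the helper functions can be exercised without a Spark installation.

```python
import random
import sys


def inside_unit_circle(_):
    """Return 1 if a random point in [-1, 1] x [-1, 1] lies inside the unit circle, else 0."""
    x = random.uniform(-1.0, 1.0)
    y = random.uniform(-1.0, 1.0)
    return 1 if x * x + y * y <= 1.0 else 0


def estimate_pi(hits, samples):
    """Monte Carlo estimate: the hit ratio approximates pi / 4."""
    return 4.0 * hits / samples


def main(partitions):
    # Imported here so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("custom-pi").getOrCreate()
    samples = 100000 * partitions
    # Distribute the sampling across the executors and count the hits.
    hits = (
        spark.sparkContext
        .parallelize(range(samples), partitions)
        .map(inside_unit_circle)
        .sum()
    )
    print("Pi is roughly %f" % estimate_pi(hits, samples))
    spark.stop()


if __name__ == "__main__":
    # Optional first argument: number of partitions (defaults to 2).
    main(int(sys.argv[1]) if len(sys.argv) > 1 else 2)
```

Assuming the script has been copied into the image at a hypothetical path such as /opt/spark/work-dir/my_job.py, it could be submitted with the same spark-submit command by replacing the last argument with local:///opt/spark/work-dir/my_job.py.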