You may encounter the following error message when running Spark with a custom kernel in the Jupyter + Spark app:
25/04/25 10:49:01 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (10.6.7.6 executor 22): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/apps/spack/0.21/ascend/linux-rhel9-zen2/spark/gcc/11.4.1/3.5.1-lbffccn/python/lib/pyspark.zip/pyspark/worker.py", line 1100, in main
    raise PySparkRuntimeError(
pyspark.errors.exceptions.base.PySparkRuntimeError: [PYTHON_VERSION_MISMATCH] Python in worker has different version (3, 12) than that in driver 3.9, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
This error indicates a mismatch between the Python version used by the Jupyter kernel (the Spark driver) and the Python version used by the Spark workers. In this case, the workers were running Python 3.12 while the kernel was running Python 3.9. PySpark requires that the driver and the workers use the same minor version of Python.
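You can confirm the mismatch from inside the notebook. The following is a minimal sketch, assuming a SparkSession named spark already exists: it prints the driver's Python version and runs a trivial Spark job to report the version used on an executor.

import sys

# Python version of the driver (the Jupyter kernel)
print("Driver Python:", sys.version_info[:3])

# Python version of a worker, reported by a one-element Spark job
worker_version = (
    spark.sparkContext
    .parallelize([0], 1)
    .map(lambda _: __import__("sys").version_info[:3])
    .first()
)
print("Worker Python:", worker_version)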
To resolve this issue, create your Conda environment with Python 3.12 so that it matches the Python version on the Spark workers, and create the Spark session with the SparkSession builder, as shown in the sketch below.
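The following is a minimal sketch of creating the session from the kernel. It assumes the kernel's Conda environment is active (so CONDA_PREFIX points at it) and that this environment provides Python 3.12; the application name is hypothetical. Setting PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON before building the session is the check suggested by the error message itself.

import os
from pyspark.sql import SparkSession

# Interpreter of the active Conda environment (assumed to be Python 3.12)
python312 = os.path.join(os.environ["CONDA_PREFIX"], "bin", "python")

# Point both the driver and the executors at the same interpreter
# so the minor versions match.
os.environ["PYSPARK_PYTHON"] = python312
os.environ["PYSPARK_DRIVER_PYTHON"] = python312

spark = (
    SparkSession.builder
    .appName("python-version-check")  # hypothetical application name
    .getOrCreate()
)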
Affected version
3.5.1