environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON

This can also happen if you're working inside an environment (for example, a virtual environment). In that case it may be harder to find the correct path to the Python executable, and anyway I think it's not a good idea to hardcode the path if you want to share your code with others.

Running the following lines at the beginning of your script or notebook (in any case, before you create the SparkSession/SparkContext) solves the problem:

import os
import sys

# Point both the workers and the driver at the interpreter running this script
os.environ['PYSPARK_PYTHON'] = sys.executable
os.environ['PYSPARK_DRIVER_PYTHON'] = sys.executable

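The os module lets you set environment variables for the current process (and any child processes it starts) through os.environ; the sys module provides sys.executable, a string with the absolute path of the Python interpreter that is currently running.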
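If you want to reuse the same fix across several scripts, one option is to wrap it in a small helper. The sketch below is only an illustration: the names pin_pyspark_python and build_session, the app name "env-check", and the local[*] master are my own assumptions, not something PySpark requires. The pyspark import is kept inside build_session so that the variables are always set before Spark is touched.

```python
import os
import sys


def pin_pyspark_python() -> str:
    """Point both PySpark variables at the interpreter running this script.

    Hypothetical helper name; it only wraps the two assignments shown above.
    """
    interpreter = sys.executable  # absolute path of the current Python binary
    os.environ["PYSPARK_PYTHON"] = interpreter
    os.environ["PYSPARK_DRIVER_PYTHON"] = interpreter
    return interpreter


def build_session():
    """Create a local SparkSession (assumes pyspark is installed).

    The import is deferred so nothing Spark-related loads before the
    environment variables are set.
    """
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")      # assumption: a local run, adjust as needed
        .appName("env-check")    # hypothetical application name
        .getOrCreate()
    )


# Set the variables first; call build_session() only afterwards.
interpreter = pin_pyspark_python()
print("PySpark will use:", interpreter)
```

After this, calling spark = build_session() should start Spark with the same interpreter you launched the script with, whichever environment is active.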
