Running custom Java class in PySpark

In PySpark try the following

from py4j.java_gateway import java_import
java_import(sc._gateway.jvm,"org.foo.module.Foo")

func = sc._gateway.jvm.Foo()
func.fooMethod()

Make sure that you have compiled your Java code into a runnable jar and submit the spark job like so

spark-submit --driver-class-path "name_of_your_jar_file.jar" --jars "name_of_your_jar_file.jar" name_of_your_python_file.py

Leave a Comment