Encode and assemble multiple features in PySpark
Spark >= 2.3, >= 3.0

Since Spark 2.3, OneHotEncoder is deprecated in favor of OneHotEncoderEstimator. If you use a recent release, modify the encoder code as follows:

    from pyspark.ml.feature import OneHotEncoderEstimator

    encoder = OneHotEncoderEstimator(
        inputCols=["gender_numeric"],
        outputCols=["gender_vector"]
    )

In Spark 3.0 this variant has been renamed to OneHotEncoder:

    from pyspark.ml.feature import OneHotEncoder

    encoder = OneHotEncoder(
        inputCols=["gender_numeric"],
        outputCols=["gender_vector"]
    )

…
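If the same code has to run on both 2.x and 3.x clusters, the version split described above can be handled in one place. The following is only a minimal sketch: the helper names `encoder_class_name` and `make_gender_encoder` are hypothetical and not part of PySpark, and the sketch assumes the class you want is the multi-column encoder that takes `inputCols` and `outputCols`.

```python
def encoder_class_name(spark_version: str) -> str:
    """Return the name of the multi-column one-hot encoder class
    in pyspark.ml.feature for the given Spark version string.

    Hypothetical helper, not part of PySpark.
    """
    # Compare only major.minor, so "2.4.8" and "3.0.0-preview2" both parse.
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    if (major, minor) >= (3, 0):
        return "OneHotEncoder"           # OneHotEncoderEstimator renamed in 3.0
    if (major, minor) >= (2, 3):
        return "OneHotEncoderEstimator"  # replacement introduced in 2.3
    raise ValueError(
        "Spark %s predates OneHotEncoderEstimator" % spark_version
    )


def make_gender_encoder():
    """Build the encoder from the snippets above for the installed PySpark.

    Hypothetical helper. Assumes pyspark is installed and a SparkSession
    is already active, which PySpark ML stages normally need on construction.
    """
    import pyspark
    import pyspark.ml.feature as feature

    encoder_cls = getattr(feature, encoder_class_name(pyspark.__version__))
    return encoder_cls(
        inputCols=["gender_numeric"],
        outputCols=["gender_vector"],
    )
```

Resolving the class by name keeps the import from failing on a version where the other class does not exist. The column names are the ones used in the snippets above.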