Convert null values to empty array in Spark DataFrame
You can use a UDF:

    import org.apache.spark.sql.functions.{coalesce, udf, when}

    val array_ = udf(() => Array.empty[Int])

combined with WHEN or COALESCE (myCol below is presumably a Column reference, e.g. col("myCol")):

    df.withColumn("myCol", when(myCol.isNull, array_()).otherwise(myCol))
    df.withColumn("myCol", coalesce(myCol, array_())).show

In recent versions you can use the array function instead:

    import org.apache.spark.sql.functions.{array, coalesce, when}

    df.withColumn("myCol", when(myCol.isNull, array().cast("array<integer>")).otherwise(myCol))
    df.withColumn("myCol", coalesce(myCol, array().cast("array<integer>"))).show

Please note that it will work only if conversion from string to the …
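
For reference, here is a minimal self-contained sketch of the COALESCE variant with the array function. It assumes a reasonably recent Spark release running in local mode. The object name NullToEmptyArray, the helper fillNullArrays, and the sample data are hypothetical and exist only for illustration; they are not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, coalesce, col}

object NullToEmptyArray {
  // Hypothetical helper: replaces nulls in an array<int> column with an
  // empty array. Non-null values pass through unchanged.
  def fillNullArrays(df: DataFrame, name: String): DataFrame =
    df.withColumn(name, coalesce(col(name), array().cast("array<int>")))

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough for the demo.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-to-empty-array")
      .getOrCreate()
    import spark.implicits._

    // Sample data: the second row holds a null in the array column.
    val df = Seq(
      (1, Option(Seq(1, 2))),
      (2, Option.empty[Seq[Int]])
    ).toDF("id", "myCol")

    val filled = fillNullArrays(df, "myCol")
    filled.printSchema()
    filled.show()

    spark.stop()
  }
}
```

The helper takes the column name as a string and builds the Column with col(name), which avoids relying on a myCol value being in scope. The when(...).otherwise(...) form from the answer should behave the same way here; coalesce is just the shorter spelling.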