Spark Dataframes UPSERT to Postgres Table

Upsert is not supported by Spark's built-in JDBC writer. DataFrameWriter can either append to or overwrite an existing table. If your application requires more complex logic, you'll have to handle it manually.

One option is to use an action (foreach, foreachPartition) with a standard JDBC connection. Another is to write to a temporary table and handle the rest directly in the database.
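As a rough illustration of the first option, here is a minimal Python sketch of a per-partition upsert. It assumes PostgreSQL 9.5 or later (for `INSERT ... ON CONFLICT`) and a unique constraint on the key columns. The helper names `build_upsert_sql` and `make_partition_upserter`, the `connect` factory and the `batch_size` default are hypothetical and not part of Spark. Table and column names are interpolated directly into the SQL, so they must come from trusted code, never from user input.

```python
from itertools import islice
from typing import Callable, Iterable, Sequence


def build_upsert_sql(table: str, columns: Sequence[str], key_columns: Sequence[str]) -> str:
    """Build a PostgreSQL INSERT ... ON CONFLICT statement with %s placeholders."""
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    conflict = ", ".join(key_columns)
    # Every non-key column is overwritten with the incoming (EXCLUDED) value.
    updates = [c for c in columns if c not in key_columns]
    if updates:
        assignments = ", ".join(f"{c} = EXCLUDED.{c}" for c in updates)
        action = f"DO UPDATE SET {assignments}"
    else:
        action = "DO NOTHING"
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON CONFLICT ({conflict}) {action}"
    )


def make_partition_upserter(
    connect: Callable[[], object],
    table: str,
    columns: Sequence[str],
    key_columns: Sequence[str],
    batch_size: int = 1000,
) -> Callable[[Iterable], None]:
    """Return a function suitable for DataFrame.foreachPartition.

    `connect` is a zero-argument factory returning a DB-API connection
    (an assumption: for example, a lambda wrapping psycopg2.connect).
    """
    sql = build_upsert_sql(table, columns, key_columns)

    def upsert_partition(rows: Iterable) -> None:
        conn = connect()  # one connection per partition, opened on the executor
        try:
            cur = conn.cursor()
            it = iter(rows)
            while True:
                # Rows are read by column name; Spark Row objects and dicts both work.
                batch = [tuple(row[c] for c in columns) for row in islice(it, batch_size)]
                if not batch:
                    break
                cur.executemany(sql, batch)
            conn.commit()  # the whole partition is committed as one transaction
        finally:
            conn.close()

    return upsert_partition
```

With a driver such as psycopg2 installed on the executors (an assumption), this could be used as `df.foreachPartition(make_partition_upserter(lambda: psycopg2.connect(dsn), "events", ["id", "name"], ["id"]))`, where `dsn` and the table and column names are placeholders. For the second option, the same `ON CONFLICT` clause can be applied inside the database after writing the DataFrame to a staging table, using `INSERT INTO target SELECT ... FROM staging ON CONFLICT ...`.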

See also SPARK-19335 (Spark should support doing an efficient DataFrame Upsert via JDBC) and related proposals.
