How to export a table dataframe in PySpark to csv?

If data frame fits in a driver memory and you want to save to local files system you can convert Spark DataFrame to local Pandas DataFrame using toPandas method and then simply use to_csv:

df.toPandas().to_csv('mycsv.csv')

Otherwise you can use spark-csv:

  • Spark 1.3

    df.save('mycsv.csv', 'com.databricks.spark.csv')
    
  • Spark 1.4+

    df.write.format('com.databricks.spark.csv').save('mycsv.csv')
    

In Spark 2.0+ you can use csv data source directly:

df.write.csv('mycsv.csv')

Leave a Comment