You can’t call `map` directly on a DataFrame, but you can convert the DataFrame to an RDD and map that with `spark_df.rdd.map()`. Prior to Spark 2.0, `spark_df.map` aliased to `spark_df.rdd.map()`; with Spark 2.0 and later, you must explicitly call `.rdd` first.