The following should work:

```python
# Sample DataFrame
some_df = sc.parallelize([
    ("A", "no"),
    ("B", "yes"),
    ("B", "yes"),
    ("B", "no")
]).toDF(["user_id", "phone_number"])

# Convert the Spark DataFrame to a Pandas DataFrame
pandas_df = some_df.toPandas()
```

Note that `toPandas()` collects the entire DataFrame into the driver's memory, so it is only suitable for data small enough to fit on a single machine.
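The result of `toPandas()` is an ordinary Pandas DataFrame. As a sketch of what to expect, the equivalent object can be built with pandas directly (no Spark cluster needed; the data matches the sample above):

```python
import pandas as pd

# Equivalent of some_df.toPandas() for the sample data above
pandas_df = pd.DataFrame(
    [("A", "no"), ("B", "yes"), ("B", "yes"), ("B", "no")],
    columns=["user_id", "phone_number"],
)

# Once in pandas, the full pandas API is available, e.g.:
counts = pandas_df.groupby("user_id")["phone_number"].value_counts()
print(counts)
```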