This is happening because you're using Python 3.8. The latest pip release of PySpark (2.4.4 at the time of writing) doesn't support Python 3.8. Downgrade to Python 3.7 for now, and you should be fine.
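If you're not sure which interpreter your script is actually running under, a quick check before importing PySpark can confirm it. This is only a rough sketch: the helper name `python_ok_for_pyspark_24` and the `FIRST_UNSUPPORTED` constant are made up for illustration, and the 3.8 cutoff is taken from the statement above about PySpark 2.4.4, not from any PySpark API.

```python
import sys

# Assumption (from the answer above): PySpark 2.4.x does not work on
# Python 3.8 or newer, so 3.8 is treated as the first unsupported version.
FIRST_UNSUPPORTED = (3, 8)


def python_ok_for_pyspark_24(version=None):
    """Return True if the given (major, minor, ...) version is below 3.8.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) < FIRST_UNSUPPORTED


if not python_ok_for_pyspark_24():
    # Only a warning: the actual fix is to switch to a Python 3.7 environment.
    print(
        "Running Python %d.%d; PySpark 2.4.x is expected to fail here. "
        "Use a Python 3.7 environment instead." % sys.version_info[:2]
    )
```

Run it with the same interpreter that launches your Spark job, since a virtualenv or conda environment may point at a different Python than the one on your shell's PATH.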