Concatenate two PySpark dataframes

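One option is to add the missing columns to each DataFrame as nulls and then call union (unionAll on Spark 1.6 or lower). union matches columns by position, not by name, so both DataFrames are selected in the same column order first: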

from pyspark.sql.functions import lit

cols = ['id', 'uniform', 'normal', 'normal_2']    

df_1_new = df_1.withColumn("normal_2", lit(None)).select(cols)
df_2_new = df_2.withColumn("normal", lit(None)).select(cols)

result = df_1_new.union(df_2_new)

# union keeps duplicate rows; to remove them:

result = result.dropDuplicates()
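
The same idea can be generalized so the column list does not have to be typed by hand. The following is only a sketch: the helper names merged_columns and union_with_missing are made up for illustration and are not part of PySpark, and it assumes df_1 and df_2 have compatible types in the columns they share.

```python
def merged_columns(cols_a, cols_b):
    """Return the columns of cols_a in order, then those found only in cols_b."""
    return list(cols_a) + [c for c in cols_b if c not in cols_a]


def union_with_missing(df_a, df_b):
    """Union two DataFrames, filling columns absent on either side with nulls."""
    # Imported here so merged_columns can be used without a Spark installation.
    from pyspark.sql.functions import lit

    cols = merged_columns(df_a.columns, df_b.columns)

    def pad(df):
        # Add each missing column as a null literal, then fix the column order,
        # because union pairs columns by position.
        for name in cols:
            if name not in df.columns:
                df = df.withColumn(name, lit(None))
        return df.select(cols)

    return pad(df_a).union(pad(df_b))
```

With the DataFrames above, result = union_with_missing(df_1, df_2) should give the same columns as the manual version. On Spark 3.1 or later, df_1.unionByName(df_2, allowMissingColumns=True) is a built-in alternative that fills the missing columns with nulls.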
