A left anti join is what you’re looking for:
df1.join(df2, ["userid", "group"], "leftanti")
The same result can also be obtained with a left outer join, filtering out the rows that found a match:
(df1
.join(df2, ["userid", "group"], "leftouter")
.where(df2["pick"].isNull())
.drop(df2["pick"]))