This kind of operation is called a left semi join in Spark. It keeps only the rows of the left DataFrame whose join key has a match in the right DataFrame, and it returns only the left DataFrame's columns:
df_B.join(df_A, ['col1'], 'leftsemi')
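
To make the semantics concrete without a running Spark cluster, here is a pure-Python sketch (the sample rows are hypothetical, not from the original question) that mimics what the `leftsemi` join above does:

```python
# Hypothetical rows standing in for df_B (left) and df_A (right).
df_B = [{"col1": 1, "x": "a"}, {"col1": 2, "x": "b"}, {"col1": 3, "x": "c"}]
df_A = [{"col1": 2, "y": 10}, {"col1": 3, "y": 20}, {"col1": 4, "y": 30}]

# A left semi join keeps each left row whose join key also occurs on the
# right, and never adds columns from the right side.
keys_in_A = {row["col1"] for row in df_A}  # distinct join keys on the right
semi_joined = [row for row in df_B if row["col1"] in keys_in_A]

print(semi_joined)  # only the df_B rows with col1 in {2, 3}, df_B columns only
```

Note that, unlike an inner join, a left semi join never duplicates left rows when the right side has multiple matches, which is why deduplicating the right-hand keys first (as the set above does) reproduces its behavior.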