Apache Spark: Get number of records per partition

I’d use the built-in spark_partition_id function. It should be as efficient as it gets:

import org.apache.spark.sql.functions.spark_partition_id

// Tag each row with the ID of the partition it belongs to,
// then count the rows per partition ID.
df.groupBy(spark_partition_id()).count()
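
For example, a minimal end-to-end sketch (assuming a SparkSession named spark is in scope; the DataFrame and column alias are illustrative):

import org.apache.spark.sql.functions.spark_partition_id

// 1000 rows spread evenly across 4 partitions.
val df = spark.range(0, 1000, 1, numPartitions = 4)

val counts = df.groupBy(spark_partition_id().alias("partition")).count()
counts.show()
// Expected: partition IDs 0 through 3, each with count 250
// (row order in the output is not guaranteed).

Note that the result is an ordinary DataFrame, so you can sort, filter, or collect it like any other aggregation result.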
