How to sort an RDD in Scala Spark?

If you only need the top 10, use rdd.top(10). It avoids a full sort, so it is faster.
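As a rough illustration, the two approaches can be compared side by side. This is a minimal sketch that assumes Spark is on the classpath; the object name, the helper topAndSorted, the sample numbers, and the local master setting are all made up for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TopVsSort {
  // Hypothetical helper: returns (result of top(n), result of a full descending sort + take(n)).
  def topAndSorted(sc: SparkContext, n: Int): (Array[Int], Array[Int]) = {
    // Sample data with distinct values, invented for this sketch.
    val rdd = sc.parallelize(Seq(5, 42, 17, 8, 99, 23, 4, 61, 77, 30, 12, 56))

    // top(n): the n largest elements, largest first, without sorting the whole RDD.
    val viaTop = rdd.top(n)

    // Full sort: shuffles the data, then takes the first n of the descending order.
    val viaSort = rdd.sortBy(x => x, ascending = false).take(n)

    (viaTop, viaSort)
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholder assumptions for a quick experiment.
    val conf = new SparkConf().setAppName("top-vs-sort").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val (viaTop, viaSort) = topAndSorted(sc, 10)
      println(viaTop.mkString(", "))
      println(viaTop.sameElements(viaSort))
    } finally sc.stop()
  }
}
```

If the smallest elements are wanted instead, rdd.takeOrdered(10) is the counterpart of rdd.top(10).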

rdd.top makes one parallel pass through the data, collecting the top N of each partition in a heap, then merges the heaps. For a fixed N it is an O(rdd.count) operation. A full sort would be O(rdd.count log rdd.count) and would incur a lot of data transfer: it performs a shuffle, so all of the data would be transmitted over the network.
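The heap-per-partition idea can be sketched in plain Scala, without Spark. This is only an illustration of the technique, not Spark's actual code: partitions are modelled as nested sequences, and the names TopNSketch, topOfPartition, and top are invented for the example.

```scala
import scala.collection.mutable

object TopNSketch {
  // Keep the n largest elements of one partition in a bounded min-heap.
  def topOfPartition(part: Iterator[Int], n: Int): List[Int] = {
    // Reversed ordering makes the smallest retained element the head of the queue.
    val heap = mutable.PriorityQueue.empty[Int](Ordering[Int].reverse)
    part.foreach { x =>
      if (heap.size < n) heap.enqueue(x)
      else if (n > 0 && x > heap.head) {
        // x beats the smallest retained element, so replace it.
        heap.dequeue()
        heap.enqueue(x)
      }
    }
    heap.toList
  }

  // Merge the per-partition candidates and keep the overall top n, largest first.
  def top(partitions: Seq[Seq[Int]], n: Int): Seq[Int] = {
    val candidates = partitions.flatMap(p => topOfPartition(p.iterator, n))
    topOfPartition(candidates.iterator, n).sorted(Ordering[Int].reverse)
  }

  def main(args: Array[String]): Unit = {
    // Three simulated partitions with made-up values.
    val partitions = Seq(Seq(5, 42, 17), Seq(8, 99, 23), Seq(4, 61, 77))
    println(top(partitions, 3))
  }
}
```

Each element is seen once and costs at most one heap update, and only n candidates per partition need to be merged, which is why no shuffle of the full data set is required.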
