How does HashPartitioner work?

Well, let's make your dataset marginally more interesting:

    val rdd = sc.parallelize(for { x <- 1 to 3; y <- 1 to 2 } yield (x, None), 8)

We have six elements:

    rdd.count
    Long = 6

no partitioner:

    rdd.partitioner
    Option[org.apache.spark.Partitioner] = None

and eight partitions:

    rdd.partitions.length
    Int = 8

Now let's define a small helper to …
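
For orientation, here is a minimal sketch of the rule HashPartitioner applies to each key: take the key's hashCode modulo the number of partitions, keep the result non-negative, and send null keys to partition 0. It is plain Scala and does not need Spark. The names HashPartitionSketch, nonNegativeMod and partitionFor are hypothetical and used only for illustration; this is not Spark's own code and not the helper the answer goes on to define.

```scala
// Sketch only: imitates the key-to-partition rule without depending on Spark.
object HashPartitionSketch {
  // Modulo that never returns a negative value, even for negative hash codes.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // Partition index for a key: null goes to 0, everything else is hashed.
  def partitionFor(key: Any, numPartitions: Int): Int = key match {
    case null => 0
    case _    => nonNegativeMod(key.hashCode, numPartitions)
  }

  // Same keys as the six pairs above (the None values play no part in partitioning).
  val keys: Seq[Int] = for { x <- 1 to 3; y <- 1 to 2 } yield x

  // How many elements each of the 8 partitions would receive under this rule.
  val countsPerPartition: Seq[Int] =
    (0 until 8).map(p => keys.count(k => partitionFor(k, 8) == p))
}
```

Because an Int's hashCode is the value itself, keys 1, 2 and 3 would map to partitions 1, 2 and 3 under this rule, two elements each, leaving the other five partitions empty. This only applies once a hash partitioner is actually used; the rdd above has no partitioner.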