What are possible reasons for receiving TimeoutException: Futures timed out after [n seconds] when working with Spark

Question: I was wondering what may cause this exception to be thrown?

Answer :

spark.sql.broadcastTimeout (default 300): timeout in seconds for the broadcast wait time in broadcast joins.

spark.network.timeout (default 120s): default timeout for all network interactions.

Properties such as spark.network.timeout (which backs spark.rpc.askTimeout), spark.sql.broadcastTimeout, and spark.kryoserializer.buffer.max (if you are using Kryo serialization) are commonly tuned to larger-than-default values in order to handle complex queries. You can start with these values and adjust them according to your SQL workloads.
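As a sketch, these properties can be raised when building the session. The values below are purely illustrative assumptions, not recommendations; tune them against your own workload:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative values only -- start from the defaults and raise gradually.
val spark = SparkSession.builder()
  .appName("tuned-app")
  .config("spark.sql.broadcastTimeout", "1200")      // default 300 (seconds)
  .config("spark.network.timeout", "600s")           // default 120s
  .config("spark.kryoserializer.buffer.max", "512m") // only relevant with Kryo
  .getOrCreate()
```

The same keys can also be passed on the command line via `spark-submit --conf key=value` without touching application code.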

Note: the documentation says that

The following options (the spark.sql.* properties) can also be used to tune the performance of query execution. It is possible that these options will be deprecated in a future release as more optimizations are performed automatically.

Also, for a better understanding, you can look at BroadcastHashJoin, whose doExecute method is the trigger point for the stack trace above:

protected override def doExecute(): RDD[Row] = {
  // Blocks for at most `timeout` (spark.sql.broadcastTimeout). If the
  // broadcast side has not been built and shipped by then, Await.result
  // throws TimeoutException: Futures timed out after [n seconds].
  val broadcastRelation = Await.result(broadcastFuture, timeout)

  streamedPlan.execute().mapPartitions { streamedIter =>
    hashJoin(streamedIter, broadcastRelation.value)
  }
}
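If raising spark.sql.broadcastTimeout is not enough (for example, because the broadcast side is simply too large to materialize in time), a common workaround is to stop Spark from choosing a broadcast join at all. A minimal sketch, assuming `spark` is an existing SparkSession:

```scala
// Disable automatic broadcast joins (the default threshold is 10 MB);
// Spark then falls back to a shuffle-based join, which does not go
// through this broadcast future and so cannot hit this timeout.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
```

The trade-off is that a shuffle join is usually slower than a successful broadcast join, so prefer this only when the broadcast genuinely cannot complete.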
