Digging into Spark Scheduler Delay

I posted the other day about the Event Timeline visualisation you can get in the Stages view of the Spark Application UI. What I didn’t cover was the Event Timeline you can get when you click through to the Stage details page. The Stage details page lists all of the tasks executed as part of the stage’s processing; a task is the actual unit of work processed by a Spark executor, and there is one task for each partition.
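
By default the number of partitions, and therefore the number of tasks, comes from the context’s default parallelism. You can check that value straight from the spark-shell; what you get back depends on your master URL and configuration, so don’t expect my numbers:

// the default partition count makeRDD will use, and hence how many tasks a count will run
sc.defaultParallelism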

Just like on the Stages screen, there is an option to display an Event Timeline that shows where and when each task is run. An example is shown in the diagram below; it was produced by running the following code:


val rdd = sc.makeRDD(1 to 1000)  // distribute the numbers 1 to 1000 across the executors

rdd.count                        // action that triggers a job, with one task per partition

[Figure: Tasks Event Timeline]
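
The number of bars on the timeline is simply the number of partitions in the RDD. If you want the chart to show a predictable number of tasks, makeRDD also takes an explicit slice count; the 16 below is an arbitrary choice:

// ask for exactly 16 partitions, so the stage runs exactly 16 tasks
val rdd16 = sc.makeRDD(1 to 1000, 16)
rdd16.count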

Each task is a bar on the chart, and different sections of the bar are colour-coded for the different phases of the task’s execution:

  • Scheduler Delay
  • Task Deserialization Time
  • Shuffle Read Time
  • Executor Computing Time
  • Shuffle Write Time
  • Result Serialization Time
  • Getting Result Time

What’s interesting in the diagram is that 8 of the tasks executed much more quickly than the others, and the key difference is the Scheduler Delay (40 ms vs 600 ms). The reason is that I have two executors running: one on my laptop (the same machine that is running the driver) and one on a machine connected over my wifi. The wifi is pretty variable, with ping times ranging from 30 ms to 500 ms. It looks very much like Scheduler Delay includes time spent waiting for communication between the driver and the executor. So if you see similar symptoms, with tasks on different executors experiencing different levels of scheduler delay, then network wait time may be the cause.
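
If you’d rather pull those numbers out programmatically than read them off the chart, a SparkListener gives you the per-task metrics behind the timeline. The sketch below approximates the UI’s scheduler-delay figure as the task’s wall-clock duration minus the phases Spark reports; the exact formula the UI uses varies between Spark versions, and the class name here is just for illustration.

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Approximate "Scheduler Delay" for each completed task: wall-clock duration
// minus the deserialization, run and result-serialization times Spark reports.
class SchedulerDelayListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val info = taskEnd.taskInfo
    val metrics = taskEnd.taskMetrics
    if (info != null && metrics != null && info.successful) {
      val accounted = metrics.executorDeserializeTime +
        metrics.executorRunTime +
        metrics.resultSerializationTime
      val schedulerDelay = math.max(0L, info.duration - accounted)
      println(s"task ${info.taskId} on ${info.host}: ~${schedulerDelay} ms scheduler delay")
    }
  }
}

// register on the driver before running the job
sc.addSparkListener(new SchedulerDelayListener)
rdd.count

Grouping that output by host should make a driver-local executor versus a wifi-connected one stand out, the same pattern as in the timeline above.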
