May 13, 2023

spark-submit - how to do memory tuning?


Scenario 1
  • In Spark, given 10 GB of data (on, say, a 100-node cluster), how many executors would you choose?
  • Rules of thumb used here: 128 MB block size, ~2 tasks per block, 1 core = 4 tasks, 1 executor = 5 cores max
  • 10 GB -> 80 blocks -> 160 tasks -> 40 cores -> 8 executors
  • 100 GB -> 800 blocks -> 1600 tasks -> 400 cores -> 80 executors
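The chain above can be sketched as a small helper. The ratios (128 MB blocks, 2 tasks per block, 4 tasks per core, 5 cores per executor) are the rules of thumb from the bullets, not fixed Spark defaults:

```python
import math

def estimate_executors(data_gb,
                       block_mb=128,          # assumed HDFS block size
                       tasks_per_block=2,     # implied by 80 blocks -> 160 tasks
                       tasks_per_core=4,      # "1 core = 4 tasks"
                       cores_per_executor=5): # "1 executor = 5 cores max"
    blocks = math.ceil(data_gb * 1024 / block_mb)
    tasks = blocks * tasks_per_block
    cores = math.ceil(tasks / tasks_per_core)
    return math.ceil(cores / cores_per_executor)

print(estimate_executors(10))   # 10 GB  -> 8 executors
print(estimate_executors(100))  # 100 GB -> 80 executors
```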
Scenario 2
  • How to find the number of executors for better performance? Three knobs to set:
    • Executor cores
    • Executor memory
    • Number of executors
    • No of nodes = 5, each node = 16 cores, each node RAM = 64 GB
    • Each of the 5 nodes: 16 cores / 5 cores per executor = 3 executors per node
    • So total 15 (5 * 3) executors
  • How to find executor memory for better performance?
    • Each node has 64 GB
    • 64 GB / 3 executors (from above) = 21 GB approx per executor
    • Deduct memory overhead (~7% rule of thumb), which Spark reserves off-heap per executor
    • 21 * 0.07 = 1.47, i.e., ~2 GB
    • 21 - 2 = 19 GB, i.e., 21 * (1 - 0.07) ≈ 19 GB
  • Finally
    • Number of executors (--num-executors) = 15
    • Executor cores (--executor-cores) = 5 (a common sweet spot for good HDFS throughput)
    • Executor memory (--executor-memory) = 19 GB
    • Driver memory (--driver-memory)
      • The common practice among data engineers is to configure driver memory relatively small compared to the executors
      • AWS recommends matching it to the executor memory
    • Driver cores (--driver-cores)
      • Default is 1
      • However, I’ve found that jobs using more than 500 Spark cores can see a performance benefit if the driver core count is set to match the executor core count.
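Putting those flags together into a command line (a sketch; "my_app.py" is a placeholder for your application, and the driver settings here follow the AWS-style "match the executors" choice):

```python
# Assemble spark-submit flags from the sizing above.
flags = {
    "--num-executors": "15",
    "--executor-cores": "5",
    "--executor-memory": "19G",
    "--driver-memory": "19G",  # AWS-style: match executor memory
    "--driver-cores": "5",     # only worth raising on very large (>500-core) jobs
}
cmd = ["spark-submit"]
for flag, value in flags.items():
    cmd += [flag, value]
cmd.append("my_app.py")  # placeholder application
print(" ".join(cmd))
```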

No of nodes = 10, each node = 16 cores, Each node RAM = 64 GB
Let’s assign 5 cores per executor => --executor-cores = 5 (for good HDFS throughput)
Leave 1 core per node for Hadoop/YARN daemons => cores available per node = 16 - 1 = 15 (each node has 16 cores)
So, total available cores in the cluster = 15 x 10 = 150
Number of available executors = (total cores / cores per executor) = 150 / 5 = 30
Leaving 1 executor for the YARN ApplicationMaster => --num-executors = 29
Number of executors per node = 30 / 10 = 3
Memory per executor = 64GB/3 = 21GB
Counting off-heap memory overhead: Spark reserves max(384 MB, ~10% of executor memory) by default, i.e., ~2.1 GB here; round up to 3 GB.
  • So, actual --executor-memory = 21 - 3 = 18GB
So, the recommended config is: 
  • 29 executors, 18GB memory each and 5 cores each!!
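The whole recipe can be wrapped in one function: reserve 1 core per node for daemons, 1 executor for the YARN ApplicationMaster, and overhead of max(384 MB, ~10% of executor memory), which matches Spark's default on YARN:

```python
import math

def size_cluster(nodes, cores_per_node, ram_per_node_gb, cores_per_executor=5):
    usable_cores = (cores_per_node - 1) * nodes        # 1 core/node for OS + daemons
    executors = usable_cores // cores_per_executor     # raw executor count
    num_executors = executors - 1                      # 1 executor for the YARN AM
    executors_per_node = executors // nodes
    mem_gb = ram_per_node_gb // executors_per_node     # heap + overhead budget
    overhead_gb = max(0.384, mem_gb * 0.10)            # max(384 MB, ~10%)
    executor_memory_gb = mem_gb - math.ceil(overhead_gb)
    return num_executors, cores_per_executor, executor_memory_gb

print(size_cluster(10, 16, 64))  # -> (29, 5, 18)
```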


Dynamic allocation

Instead of hard-coding the executor count, Spark can scale it at runtime based on the workload
  • --executor-cores and --executor-memory still need to be sized as above; --num-executors is replaced by min/max bounds
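A sketch of the corresponding spark-submit configuration (the min/max/initial values are illustrative, not recommendations; the conf keys are Spark's standard dynamic-allocation settings):

```python
# Illustrative dynamic-allocation settings for Spark on YARN.
dynamic_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "2",
    "spark.dynamicAllocation.maxExecutors": "29",
    "spark.dynamicAllocation.initialExecutors": "5",
    # On YARN, dynamic allocation needs the external shuffle service
    # (or spark.dynamicAllocation.shuffleTracking.enabled=true on Spark 3+).
    "spark.shuffle.service.enabled": "true",
}
cmd = ["spark-submit"]
for key, value in dynamic_conf.items():
    cmd += ["--conf", f"{key}={value}"]
print(" ".join(cmd))
```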

Running executors with too much memory often results in excessive garbage collection delays.
Running tiny executors (with a single core and just enough memory needed to run a single task, for example) throws away the benefits that come from running multiple tasks in a single JVM.
