May 13, 2023

spark-submit - how to do memory tuning?


Scenario 1
  • In Spark, given 10 GB of data (on, say, a 100-node cluster), how many executors would you choose?
  • Rules of thumb used here: 128 MB block size, ~2 tasks per block, 1 core = 4 tasks, 1 executor = 5 cores max
  • 10 GB -> 80 blocks -> 160 tasks -> 40 cores -> 8 executors
  • 100 GB -> 800 blocks -> 1600 tasks -> 400 cores -> 80 executors
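The chain above can be sketched as a small helper. The ratios (128 MB blocks, 2 tasks per block, 4 tasks per core, 5 cores per executor) are the rules of thumb from the bullets, not fixed Spark defaults:

```python
import math

def estimate_executors(data_gb,
                       block_mb=128,          # assumed HDFS block size
                       tasks_per_block=2,     # implied by 80 blocks -> 160 tasks
                       tasks_per_core=4,      # "1 core = 4 tasks"
                       cores_per_executor=5): # "1 executor = 5 cores max"
    blocks = math.ceil(data_gb * 1024 / block_mb)
    tasks = blocks * tasks_per_block
    cores = math.ceil(tasks / tasks_per_core)
    return math.ceil(cores / cores_per_executor)

print(estimate_executors(10))   # 10 GB  -> 8 executors
print(estimate_executors(100))  # 100 GB -> 80 executors
```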
Scenario 2
  • How to find the number of executors for better performance? Three knobs to set:
    • Executor cores
    • Executor memory
    • Number of executors
    • No of nodes = 5, each node = 16 cores, each node RAM = 64 GB
    • Each of the 5 nodes: 16 cores / 5 cores per executor = 3 executors per node
    • So total 15 (5 * 3) executors
  • How to find executor memory for better performance?
    • Each node has 64 GB
    • 64 GB / 3 executors (from above) = 21 GB approx per executor
    • Deduct memory overhead (~7% rule of thumb), which Spark reserves off-heap per executor
    • 21 * 0.07 = 1.47, i.e., ~2 GB
    • 21 - 2 = 19 GB, i.e., 21 * (1 - 0.07) ≈ 19 GB
  • Finally
    • Number of executors (--num-executors) = 15
    • Executor cores (--executor-cores) = 5 (a common sweet spot for good HDFS throughput)
    • Executor memory (--executor-memory) = 19 GB
    • Driver memory (--driver-memory)
      • The common practice among data engineers is to configure driver memory relatively small compared to the executors
      • AWS recommends matching it to the executor memory
    • Driver cores (--driver-cores)
      • Default is 1
      • However, I’ve found that jobs using more than 500 Spark cores can see a performance benefit if the driver core count is set to match the executor core count.
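Putting those flags together into a command line (a sketch; "my_app.py" is a placeholder for your application, and the driver settings here follow the AWS-style "match the executors" choice):

```python
# Assemble spark-submit flags from the sizing above.
flags = {
    "--num-executors": "15",
    "--executor-cores": "5",
    "--executor-memory": "19G",
    "--driver-memory": "19G",  # AWS-style: match executor memory
    "--driver-cores": "5",     # only worth raising on very large (>500-core) jobs
}
cmd = ["spark-submit"]
for flag, value in flags.items():
    cmd += [flag, value]
cmd.append("my_app.py")  # placeholder application
print(" ".join(cmd))
```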

No of nodes = 10, each node = 16 cores, Each node RAM = 64 GB
Let’s assign 5 cores per executor => --executor-cores = 5 (for good HDFS throughput)
Leave 1 core per node for Hadoop/YARN daemons => cores available per node = 16 - 1 = 15 (each node has 16 cores)
So, total available cores in the cluster = 15 x 10 = 150
Number of available executors = (total cores / cores per executor) = 150 / 5 = 30
Leaving 1 executor for the YARN ApplicationMaster => --num-executors = 29
Number of executors per node = 30 / 10 = 3
Memory per executor = 64GB/3 = 21GB
Counting off-heap memory overhead: Spark reserves max(384 MB, ~10% of executor memory) by default, i.e., ~2.1 GB here; round up to 3 GB.
  • So, actual --executor-memory = 21 - 3 = 18GB
So, the recommended config is: 
  • 29 executors, 18GB memory each and 5 cores each!!
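The whole recipe can be wrapped in one function: reserve 1 core per node for daemons, 1 executor for the YARN ApplicationMaster, and overhead of max(384 MB, ~10% of executor memory), which matches Spark's default on YARN:

```python
import math

def size_cluster(nodes, cores_per_node, ram_per_node_gb, cores_per_executor=5):
    usable_cores = (cores_per_node - 1) * nodes        # 1 core/node for OS + daemons
    executors = usable_cores // cores_per_executor     # raw executor count
    num_executors = executors - 1                      # 1 executor for the YARN AM
    executors_per_node = executors // nodes
    mem_gb = ram_per_node_gb // executors_per_node     # heap + overhead budget
    overhead_gb = max(0.384, mem_gb * 0.10)            # max(384 MB, ~10%)
    executor_memory_gb = mem_gb - math.ceil(overhead_gb)
    return num_executors, cores_per_executor, executor_memory_gb

print(size_cluster(10, 16, 64))  # -> (29, 5, 18)
```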


Dynamic allocation

Instead of hard-coding the executor count, Spark can scale it at runtime based on the workload
  • --executor-cores and --executor-memory still need to be sized as above; --num-executors is replaced by min/max bounds
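A sketch of the corresponding spark-submit configuration (the min/max/initial values are illustrative, not recommendations; the conf keys are Spark's standard dynamic-allocation settings):

```python
# Illustrative dynamic-allocation settings for Spark on YARN.
dynamic_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "2",
    "spark.dynamicAllocation.maxExecutors": "29",
    "spark.dynamicAllocation.initialExecutors": "5",
    # On YARN, dynamic allocation needs the external shuffle service
    # (or spark.dynamicAllocation.shuffleTracking.enabled=true on Spark 3+).
    "spark.shuffle.service.enabled": "true",
}
cmd = ["spark-submit"]
for key, value in dynamic_conf.items():
    cmd += ["--conf", f"{key}={value}"]
print(" ".join(cmd))
```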

Running executors with too much memory often results in excessive garbage collection delays.
Running tiny executors (with a single core and just enough memory needed to run a single task, for example) throws away the benefits that come from running multiple tasks in a single JVM.
