Methods for determining the best values for your workload
Type 1. Calculated. These can be set once and left alone. E.g.:
- yarn.nodemanager.resource.memory-mb = 163840. This is the total physical memory (here 216 GB) x (1 – 25%), reserving roughly 25% of RAM for the OS and other daemons.
- yarn.scheduler.maximum-allocation-mb = yarn.nodemanager.resource.memory-mb
- yarn.scheduler.minimum-allocation-mb = 512 (fixed value)
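The Type 1 arithmetic can be sketched as a small helper. This is a minimal illustration, assuming a 216 GB node and the 25% OS reserve above; note that the formula yields 165888 MB, which the value above rounds down to 163840 MB (160 GB):

```python
# Sketch: derive the Type 1 (calculated) YARN memory settings.
# Assumes a node with 216 GB of physical RAM and a 25% reserve
# for the OS, DataNode, NodeManager, and other daemons.

def type1_settings(total_ram_gb: int, os_reserve: float = 0.25) -> dict:
    usable_mb = int(total_ram_gb * 1024 * (1 - os_reserve))
    return {
        # Memory the NodeManager may hand out to containers:
        "yarn.nodemanager.resource.memory-mb": usable_mb,
        # A single container may at most use everything the node offers:
        "yarn.scheduler.maximum-allocation-mb": usable_mb,
        # Fixed scheduling granularity:
        "yarn.scheduler.minimum-allocation-mb": 512,
    }

print(type1_settings(216))
# -> yarn.nodemanager.resource.memory-mb = 165888 (rounded down to 163840 above)
```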
Type 2. Tuned based on workload and data size. They are:
- mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, and yarn.app.mapreduce.am.resource.mb
- mapreduce.map.java.opts = 80% x mapreduce.map.memory.mb
- mapreduce.reduce.java.opts = 80% x mapreduce.reduce.memory.mb
- yarn.app.mapreduce.am.command-opts = 80% x yarn.app.mapreduce.am.resource.mb
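The three 80% rules above can be sketched as follows. The container sizes used here (4096/8192/4096 MB) are hypothetical starting points, not values from this document; only the 80% heap fraction comes from the rules above. The 20% headroom covers off-heap JVM overhead so YARN does not kill the container:

```python
# Sketch: derive JVM heap options (Type 2) from container sizes.
# Heap = 80% of the container, per the rules above.

def heap_opts(container_mb: int, heap_fraction: float = 0.80) -> str:
    return f"-Xmx{int(container_mb * heap_fraction)}m"

# Hypothetical, workload-tuned container sizes in MB (tune these first):
containers = {
    "mapreduce.map.memory.mb": 4096,
    "mapreduce.reduce.memory.mb": 8192,
    "yarn.app.mapreduce.am.resource.mb": 4096,
}

print("mapreduce.map.java.opts =", heap_opts(containers["mapreduce.map.memory.mb"]))
print("mapreduce.reduce.java.opts =", heap_opts(containers["mapreduce.reduce.memory.mb"]))
print("yarn.app.mapreduce.am.command-opts =", heap_opts(containers["yarn.app.mapreduce.am.resource.mb"]))
```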
References:
- https://developer.ibm.com/hadoop/2016/01/21/tune-yarn-mapreduce-memory-speed-big-sql-load-analyze/
- https://discuss.pivotal.io/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters