Mean field optimal Core Allocation across Malleable jobs

Zhouzi Li, Mor Harchol-Balter, Benjamin Berg (Under submission)

Abstract: Modern data centers and cloud computing clusters are increasingly tasked with hosting workloads composed of malleable jobs. A malleable job can be parallelized across any number of cores, yet it typically exhibits diminishing marginal returns for each additional core on which it runs. We study the Core Allocation to Malleable jobs (CAM) problem in a highly general setting, allowing for multiple job classes with arbitrary concave speedup functions and holding costs. We analyze the CAM problem in the mean field asymptotic regime and derive two distinct mean field optimal policies. One of them, WHAM (Whittle Allocation for Malleable jobs), is of particular interest because it is asymptotically optimal and also serves as a good heuristic outside the asymptotic regime. Notably, none of the policies previously proposed in the literature is mean field optimal when jobs may follow different speedup functions.