Feature #6520

Updated by Brett Smith almost 8 years ago

Add one node to the wishlist for each queued container, just like we currently add one (or more) nodes to the wishlist for queued jobs. While Crunch v2 will support running multiple containers per node, that's less critical in the cloud: as long as we can boot approximately the right size node, there's not too much overhead in just having one node per container. And it's something we can do relatively quickly with the current Node Manager code.
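
A minimal sketch of that per-container wishlist logic, in Python (Node Manager's language). The helpers queued_slurm_containers() and smallest_size_for() are hypothetical stand-ins sketched further down; they are not Node Manager's actual API.

```python
def container_wishlist(node_sizes):
    # One wishlist entry per queued container, mirroring the job wishlist.
    # queued_slurm_containers() and smallest_size_for() are hypothetical
    # helpers (sketched below), not real Node Manager functions.
    wishlist = []
    for container in queued_slurm_containers():
        size = smallest_size_for(container, node_sizes)
        if size is not None:
            wishlist.append(size)  # one node per container, even if it could share
    return wishlist
```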

This won't be perfect from a scheduling perspective, especially in the interaction between Crunch v1 and Crunch v2. We expect that Crunch v2 jobs will generally "take priority" over Crunch v1 jobs, because SLURM will dispatch them from its own queue before crunch-dispatch has a chance to look and allocate nodes. We're OK with that limitation for the time being.

Node Manager should get the list of queued containers from SLURM itself, because that's the most direct source of truth about what is waiting to run. Node Manager can get information about the runtime constraints of each container either from SLURM or from the Containers API.
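
One plausible way to read the pending queue, assuming squeue is available on the Node Manager host and that container requests show up as pending SLURM jobs. The squeue flags and format codes are standard; the parsing is a sketch, not the implementation.

```python
import subprocess

def parse_slurm_mem(mem):
    """Convert squeue's %m field (e.g. '4000M', '8G') to MiB."""
    units = {'K': 1.0 / 1024, 'M': 1, 'G': 1024, 'T': 1024 * 1024}
    if mem and mem[-1] in units:
        return int(float(mem[:-1]) * units[mem[-1]])
    return int(mem or 0)

def queued_slurm_containers():
    """Yield one constraint dict per pending SLURM job.

    Format codes: %j = job name, %C = requested CPUs, %m = minimum memory.
    """
    out = subprocess.check_output(
        ['squeue', '--state=PENDING', '--noheader', '--format=%j|%C|%m'],
        universal_newlines=True)
    for line in out.splitlines():
        name, cpus, mem = line.strip().split('|')
        yield {'name': name, 'vcpus': int(cpus), 'ram_mib': parse_slurm_mem(mem)}
```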

Acceptance criteria:

* Node Manager can generate a wishlist that is informed by containers in the SLURM queue. (Whether that's the existing wishlist or a new one is an implementation detail, not an acceptance criterion either way.)
* The node sizes in that wishlist are the smallest able to meet the runtime constraints of the respective containers (see the size-selection sketch after this list).
* The Daemon actor considers these wishlist items when deciding whether to boot or shut down nodes, just as it does with the wishlist generated from the job queue today.
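
A sketch of the smallest-fit selection, assuming size objects with cores, ram_mib, and price attributes; real cloud size objects will differ.

```python
def smallest_size_for(constraints, sizes):
    """Return the cheapest node size satisfying a container's constraints,
    or None if no size fits. 'constraints' is a dict like the ones yielded
    by queued_slurm_containers() above."""
    candidates = [s for s in sizes
                  if s.cores >= constraints['vcpus']
                  and s.ram_mib >= constraints['ram_mib']]
    if not candidates:
        return None
    # Cheapest first; break price ties by fewer cores, then less RAM.
    return min(candidates, key=lambda s: (s.price, s.cores, s.ram_mib))
```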
