Feature #15087

Updated by Tom Clegg almost 5 years ago

*Background:* Currently Workbench1 has "busy/idle nodes" counters on the dashboard, but they stop working or disappear if the deprecated crunch1 services are not running. This issue suggests a low-cost way to maintain some semblance of an "is anything happening?" indicator on Workbench after migrating to crunch2.

 *Feature:* On the Workbench1 dashboard, if crunch2 is enabled, show:
 * the number of containers (visible to the current user) that have @state=Locked@, or @state=Queued@ and @priority>0@
 * the number of containers (visible to the current user) that have @state=Locked@ or @state=Running@
 * time since the earliest start time of any running container (visible to the current user) 
 * how long the oldest visible queued container has been waiting 
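
 The four indicators above could be computed roughly as follows. This is a minimal sketch, not Workbench code: it assumes the containers visible to the current user have already been fetched as a list of dicts with @state@, @priority@, @created_at@ and @started_at@ keys, and the function name @dashboard_summary@ is hypothetical. It also uses @created_at@ as a stand-in for "time spent waiting", which is only an approximation.

```python
from datetime import datetime, timedelta, timezone


def dashboard_summary(containers, now=None):
    """Summarize queue/run activity for a list of container records.

    Each record is assumed (for illustration) to be a dict with keys
    "state", "priority", "created_at" and "started_at", where the two
    timestamps are timezone-aware datetimes or None.
    """
    now = now or datetime.now(timezone.utc)

    # Bullet 1: Locked, or Queued with priority > 0.
    queued = [
        c for c in containers
        if c["state"] == "Locked"
        or (c["state"] == "Queued" and c["priority"] > 0)
    ]

    # Bullet 2: Locked or Running.
    running = [c for c in containers if c["state"] in ("Locked", "Running")]

    # Bullet 3: time since the earliest start of any running container.
    start_times = [
        c["started_at"] for c in containers
        if c["state"] == "Running" and c.get("started_at")
    ]

    # Bullet 4: approximate wait of the oldest queued container, using
    # created_at because no "ready to run since" timestamp is assumed.
    created_times = [c["created_at"] for c in queued if c.get("created_at")]

    return {
        "queued_count": len(queued),
        "running_count": len(running),
        "longest_running": now - min(start_times) if start_times else None,
        "longest_waiting": now - min(created_times) if created_times else None,
    }
```

 Returning @None@ when there are no matching containers lets the dashboard distinguish "nothing queued" from "queued for zero seconds".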

 Benefits: 
 * Easy to implement[1] in Workbench in a way that works with all dispatch setups 
 * Corresponds to reasonable user expectations ("it shouldn't take 2 hours to start a container") 

 Shortcomings: 
 * "Lots of other users' containers are queued ahead of yours" looks identical to "nothing is running at all" (assuming user is not admin) 
 * "Cluster is at capacity, with long-running containers" looks identical to "cluster is unable to run anything at all" 
 * Doesn't take advantage of the metrics we are (or could be) tracking in arvados-dispatch-cloud, like recent queued-to-starting delays and # busy/idle/booting cloud instances. 

 fn1. Assuming we aren't too picky about the definition of "oldest". Currently we don't record how long a container has been ready to run. We only record when it was created (after which it might have spent a long time with priority=0) and when it was last modified (which might reflect nothing more than a priority increase made long after it was ready to run).
