Feature #15087
Updated by Tom Clegg almost 6 years ago
*Background:* Currently Workbench1 has "busy/idle nodes" counters on the dashboard, but they stop working or disappear if the deprecated crunch1 services are not running. This issue suggests a low-cost way to maintain some semblance of an "is anything happening?" indicator on Workbench after migrating to crunch2.
*Feature:* On the Workbench1 dashboard, if crunch2 is enabled, show:
* the number of containers (visible to the current user) that have @state=Queued@ and @priority>0@
* the number of containers (visible to the current user) that have @state=Locked@ or @Running@
* the earliest start time of any running container
* how long the oldest visible queued container has been waiting
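A minimal sketch of how these four values could be derived, assuming the container records visible to the current user have already been fetched into a list of dicts. The function name @summarize_containers@ and the result keys are hypothetical. The field names @state@, @priority@, @created_at@ and @started_at@ are assumed to match the container records. Queue wait is measured from @created_at@, which can overstate the real wait (see the footnote).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, TypedDict


class Summary(TypedDict):
    queued: int                          # state=Queued and priority>0
    active: int                          # state=Locked or Running
    earliest_start: Optional[datetime]   # earliest started_at among Running
    longest_wait: Optional[timedelta]    # now - created_at of oldest queued


def summarize_containers(containers: list[dict], now: datetime) -> Summary:
    """Compute the dashboard counters from already-fetched container records.

    Hypothetical helper: `containers` is assumed to hold only records the
    current user is allowed to see.
    """
    queued = [c for c in containers
              if c["state"] == "Queued" and c["priority"] > 0]
    active = [c for c in containers
              if c["state"] in ("Locked", "Running")]

    # Only Running containers are expected to have a start time.
    starts = [c["started_at"] for c in active
              if c["state"] == "Running" and c.get("started_at") is not None]

    # Approximation: created_at is used as the "ready to run" time.
    created = [c["created_at"] for c in queued]

    return {
        "queued": len(queued),
        "active": len(active),
        "earliest_start": min(starts) if starts else None,
        "longest_wait": (now - min(created)) if created else None,
    }
```

Returning @None@ for the two time values when nothing is running or queued lets the dashboard render "nothing queued" differently from a zero-length wait.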
*Benefits:*
* Easy to implement[1] in Workbench in a way that works with all dispatch setups
* Corresponds to reasonable user expectations ("it shouldn't take 2 hours to start a container")
*Shortcomings:*
* "Lots of other users' containers are queued ahead of yours" looks identical to "nothing is running at all" (assuming user is not admin)
* "Cluster is at capacity, with long-running containers" looks identical to "cluster is unable to run anything at all"
* Doesn't take advantage of the metrics we are (or could be) tracking in arvados-dispatch-cloud, like recent queued-to-starting delays and the number of busy/idle/booting cloud instances
fn1. Assuming we aren't too picky about the definition of "oldest": currently we don't record how long a container has been ready to run. We record only when it was created (after which it might have spent a long time with priority=0) and when it was last modified (which might merely reflect a priority increase made long after it became ready to run).