Feature #15087 (closed)
[Workbench] Show number of queued containers on dashboard (instead of busy/idle nodes)
Description
Background: Currently Workbench1 has "busy/idle nodes" counters on the dashboard, but they stop working or disappear if the deprecated crunch1 services are not running. This issue suggests a low-cost way to maintain some semblance of an "is anything happening?" indicator on Workbench after migrating to crunch2.
Feature: On the Workbench1 dashboard, if crunch2 is enabled, show:
- the number of containers (visible to the current user) that have state=Locked, or state=Queued and priority>0
- the number of containers (visible to the current user) that have state=Running
- time since the earliest start time of any running container (visible to the current user)
- how long the oldest visible queued container has been waiting
- Easy to implement [1] in Workbench in a way that works with all dispatch setups
- Corresponds to reasonable user expectations ("it shouldn't take 2 hours to start a container")
- "Lots of other users' containers are queued ahead of yours" looks identical to "nothing is running at all" (assuming user is not admin)
- "Cluster is at capacity, with long-running containers" looks identical to "cluster is unable to run anything at all"
- Doesn't take advantage of the metrics we are (or could be) tracking in arvados-dispatch-cloud, like recent queued-to-starting delays and # busy/idle/booting cloud instances.
[1] Assuming we aren't too picky about the definition of "oldest" -- currently we don't record how long a container has been ready to run, only when it was created (since when it might have spent lots of time having priority=0) and when it was last modified (at which point it might have merely raised its priority long after it was ready to run)
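As a rough illustration of the four dashboard numbers described above, here is a minimal Python sketch that computes them from an in-memory list of container records. It is not the Workbench implementation. The function names (`parse_ts`, `is_pending`, `dashboard_summary`), the simplified timestamp format, and the sample records are all hypothetical. Following the caveat in the footnote, it assumes `created_at` is an acceptable stand-in for "time the container became ready to run".

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a simplified UTC timestamp such as '2019-06-01T12:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_pending(container):
    # Pending means: state=Locked, or state=Queued with priority>0.
    state = container["state"]
    return state == "Locked" or (state == "Queued" and container["priority"] > 0)


def dashboard_summary(containers, now):
    """Return counts and ages for the containers visible to the current user."""
    pending = [c for c in containers if is_pending(c)]
    running = [c for c in containers if c["state"] == "Running"]

    # Assumption: created_at approximates when a container became ready to run.
    oldest_created = min((parse_ts(c["created_at"]) for c in pending), default=None)
    earliest_start = min((parse_ts(c["started_at"]) for c in running), default=None)

    return {
        "pending_count": len(pending),
        "running_count": len(running),
        "oldest_pending_wait": None if oldest_created is None else now - oldest_created,
        "longest_running": None if earliest_start is None else now - earliest_start,
    }


# Hypothetical sample data, not taken from a real cluster.
sample = [
    {"state": "Queued", "priority": 1, "created_at": "2019-06-01T10:00:00Z", "started_at": None},
    {"state": "Queued", "priority": 0, "created_at": "2019-06-01T08:00:00Z", "started_at": None},
    {"state": "Locked", "priority": 5, "created_at": "2019-06-01T11:00:00Z", "started_at": None},
    {"state": "Running", "priority": 5, "created_at": "2019-06-01T09:00:00Z",
     "started_at": "2019-06-01T09:30:00Z"},
]
now = parse_ts("2019-06-01T12:00:00Z")
summary = dashboard_summary(sample, now)
print(summary["pending_count"], summary["running_count"])  # prints "2 1"
```

The container queued with priority=0 is ignored even though it is the oldest record, which is the behavior the filter in the feature description asks for. It also shows why `created_at` is only an approximation: a container that sat at priority=0 for hours and was then bumped would report an inflated wait.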
Updated by Tom Clegg over 5 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
- Story points set to 1.0
Updated by Peter Amstutz over 5 years ago
- Related to Idea #15133: Remove crunch v1 (jobs api) added
Updated by Tom Clegg over 5 years ago
- Related to Bug #15036: [Crunch2] Idle/busy node count not accurate if crunch1 not running added
Updated by Tom Clegg over 5 years ago
- Related to Bug #15014: [Workbench] Hide busy/idle nodes display when crunch1 is not active added
Updated by Tom Morris over 5 years ago
- Related to deleted (Idea #15133: Remove crunch v1 (jobs api))
Updated by Tom Morris over 5 years ago
- Blocks Idea #15133: Remove crunch v1 (jobs api) added
Updated by Peter Amstutz over 5 years ago
- Assigned To set to Peter Amstutz
- Target version changed from Arvados Future Sprints to 2019-06-19 Sprint
Updated by Peter Amstutz over 5 years ago
Do we actually want the age of the oldest queued container, or the age of the container at the top of the queue (which would be the highest-priority one)?
Updated by Peter Amstutz over 5 years ago
15087-wb-queued-containers @ 6b9eafb0de63da57e7b1a3945e7d16823e1c25df
Adds a new panel displaying pending/running containers and age of oldest container and longest running container.
Updated by Lucas Di Pentima over 5 years ago
Tried it locally by running some simple jobs on arvbox, seems to work ok! Just one comment:
- I noticed that no tests were modified/added, and some are failing at https://ci.curoverse.com/job/developer-run-tests/1308/
Updated by Peter Amstutz over 5 years ago
15087-wb-queued-containers @ 1cf8673787f9aa62d2a63212522f883c867219af
https://ci.curoverse.com/view/Developer/job/developer-run-tests/1310/
Fixed tests (hopefully).
Updated by Peter Amstutz over 5 years ago
15087-wb-queued-containers @ 466ae87ff214d37c0765ee64845941adcbae8af4
Added links to the oldest / longest running container. Tweaked the layout. Re-running tests:
https://ci.curoverse.com/view/Developer/job/developer-run-tests/1315/
Updated by Peter Amstutz over 5 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|bf9803ee5afb33231da7900dddfdfac34b7056a6.