Feature #15087

[Workbench] Show number of queued containers on dashboard (instead of busy/idle nodes)

Added by Tom Clegg about 1 year ago. Updated 12 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench
Target version:
Start date:
06/14/2019
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0
Release relationship:
Auto

Description

Background: Currently Workbench1 has "busy/idle nodes" counters on the dashboard, but they stop working or disappear if the deprecated crunch1 services are not running. This issue suggests a low-cost way to maintain some semblance of an "is anything happening?" indicator on Workbench after migrating to crunch2.

Feature: On the Workbench1 dashboard, if crunch2 is enabled, show
  • the number of containers (visible to the current user) that have state=Locked, or state=Queued and priority>0
  • the number of containers (visible to the current user) that have state=Running
  • time since the earliest start time of any running container (visible to the current user)
  • how long the oldest visible queued container has been waiting
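The four numbers above could be derived roughly as follows. This is a minimal sketch, not the Workbench implementation: it assumes the container records have already been fetched and filtered to those visible to the current user, that each record is a dict with `state`, `priority`, `created_at` and `started_at` keys holding timezone-aware `datetime` values, and that "waiting" is approximated by `created_at` over the whole pending set (Locked, or Queued with priority>0). The function name `dashboard_summary` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def dashboard_summary(containers, now):
    """Summarize container records for a dashboard panel.

    `containers` is assumed to be a list of dicts already limited to
    what the current user can see (hypothetical input shape).
    """
    # Pending: Locked, or Queued with a positive priority.
    pending = [
        c for c in containers
        if c["state"] == "Locked"
        or (c["state"] == "Queued" and c["priority"] > 0)
    ]
    running = [c for c in containers if c["state"] == "Running"]

    # Earliest start time among running containers, if any have started.
    starts = [c["started_at"] for c in running if c["started_at"] is not None]
    earliest_start = min(starts, default=None)

    # created_at is only an approximation of "ready to run since".
    earliest_created = min((c["created_at"] for c in pending), default=None)

    return {
        "pending": len(pending),
        "running": len(running),
        "longest_running": now - earliest_start if earliest_start else None,
        "longest_waiting": now - earliest_created if earliest_created else None,
    }
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the calculation easy to check against fixed sample data.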
Benefits:
  • Easy to implement [1] in Workbench in a way that works with all dispatch setups
  • Corresponds to reasonable user expectations ("it shouldn't take 2 hours to start a container")
Shortcomings:
  • "Lots of other users' containers are queued ahead of yours" looks identical to "nothing is running at all" (assuming user is not admin)
  • "Cluster is at capacity, with long-running containers" looks identical to "cluster is unable to run anything at all"
  • Doesn't take advantage of the metrics we are (or could be) tracking in arvados-dispatch-cloud, like recent queued-to-starting delays and # busy/idle/booting cloud instances.

[1] Assuming we aren't too picky about the definition of "oldest": currently we don't record how long a container has been ready to run, only when it was created (it may have spent much of the time since then with priority=0) and when it was last modified (which may have been nothing more than a priority change made long after it became ready to run).
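One way to read the caveat in this note: the two recorded timestamps bracket the real waiting time rather than pinning it down. The sketch below assumes the moment a container became ready to run falls somewhere between `created_at` and `modified_at` (that is, that becoming ready counts as a modification), and that both fields hold timezone-aware `datetime` values. The helper name `wait_bounds` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def wait_bounds(container, now):
    """Return (lower, upper) bounds on how long a container has been ready.

    Assumes the ready-to-run moment lies between created_at and
    modified_at; neither timestamp records it exactly.
    """
    lower = now - container["modified_at"]  # ready no later than the last change
    upper = now - container["created_at"]   # ready no earlier than creation
    return lower, upper
```

A dashboard that shows the `created_at`-based figure is therefore reporting the upper bound, which can overstate the wait for a container that sat at priority=0 for a while.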


Subtasks

Task #15325: Review 15087-wb-queued-containers (Resolved, Peter Amstutz)


Related issues

Related to Arvados - Bug #15036: [Crunch2] Idle/busy node count not accurate if crunch1 not running (Duplicate)

Related to Arvados - Bug #15014: [Workbench] Hide busy/idle nodes display when crunch1 is not active (Resolved, 09/30/2019)

Blocks Arvados - Story #15133: Remove crunch v1 (jobs api) (Resolved, 08/08/2019)

Associated revisions

Revision bf9803ee
Added by Peter Amstutz 12 months ago

Merge branch '15087-wb-queued-containers' closes #15087

Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <>

History

#1 Updated by Tom Clegg about 1 year ago

  • Description updated (diff)

#2 Updated by Tom Clegg about 1 year ago

  • Description updated (diff)

#3 Updated by Tom Clegg about 1 year ago

  • Target version changed from To Be Groomed to Arvados Future Sprints
  • Story points set to 1.0

#4 Updated by Peter Amstutz about 1 year ago

  • Related to Story #15133: Remove crunch v1 (jobs api) added

#5 Updated by Tom Clegg about 1 year ago

  • Related to Bug #15036: [Crunch2] Idle/busy node count not accurate if crunch1 not running added

#6 Updated by Tom Clegg about 1 year ago

  • Related to Bug #15014: [Workbench] Hide busy/idle nodes display when crunch1 is not active added

#7 Updated by Tom Morris about 1 year ago

  • Related to deleted (Story #15133: Remove crunch v1 (jobs api))

#8 Updated by Tom Morris about 1 year ago

#9 Updated by Peter Amstutz 12 months ago

  • Assigned To set to Peter Amstutz
  • Target version changed from Arvados Future Sprints to 2019-06-19 Sprint

#10 Updated by Ward Vandewege 12 months ago

  • Release set to 22

#11 Updated by Tom Clegg 12 months ago

  • Description updated (diff)

#12 Updated by Peter Amstutz 12 months ago

Do we actually want the age of the oldest queued container, or the age of the container at the top of the queue (which would be the highest-priority one)?

#13 Updated by Peter Amstutz 12 months ago

15087-wb-queued-containers @ 6b9eafb0de63da57e7b1a3945e7d16823e1c25df

Adds a new panel displaying the number of pending/running containers, the age of the oldest container, and the age of the longest-running container.

#14 Updated by Lucas Di Pentima 12 months ago

Tried it locally by running some simple jobs on arvbox, seems to work ok! Just one comment:

#16 Updated by Peter Amstutz 12 months ago

15087-wb-queued-containers @ 466ae87ff214d37c0765ee64845941adcbae8af4

Added links to the oldest / longest running container. Tweaked the layout. Re-running tests:

https://ci.curoverse.com/view/Developer/job/developer-run-tests/1315/

#17 Updated by Lucas Di Pentima 12 months ago

Latest updates LGTM, thanks.

#18 Updated by Peter Amstutz 12 months ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
