Feature #15087

closed

[Workbench] Show number of queued containers on dashboard (instead of busy/idle nodes)

Added by Tom Clegg over 3 years ago. Updated about 3 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench
Target version:
Start date:
06/14/2019
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0
Release relationship:
Auto

Description

Background: Currently Workbench1 has "busy/idle nodes" counters on the dashboard, but they stop working or disappear if the deprecated crunch1 services are not running. This issue suggests a low-cost way to maintain some semblance of an "is anything happening?" indicator on Workbench after migrating to crunch2.

Feature: On the Workbench1 dashboard, if crunch2 is enabled, show
  • the number of containers (visible to the current user) that have state=Locked, or state=Queued and priority>0
  • the number of containers (visible to the current user) that have state=Running
  • time since the earliest start time of any running container (visible to the current user)
  • how long the oldest visible queued container has been waiting
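
The four values listed above can be sketched as a pure function over container records. This is only an illustration, not the Workbench implementation: the function name `dashboard_summary` and the result keys are hypothetical, the records are assumed to be already restricted to what the current user can see, and `created_at` is assumed as a stand-in for "time queued" (see footnote 1 for why that is approximate).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, TypedDict


class Container(TypedDict):
    """Subset of container fields used by this sketch."""
    state: str
    priority: int
    created_at: datetime
    started_at: Optional[datetime]


def dashboard_summary(containers: list[Container], now: datetime) -> dict:
    """Compute the four dashboard values (hypothetical helper)."""
    # Pending: state=Locked, or state=Queued with priority>0.
    pending = [
        c for c in containers
        if c["state"] == "Locked"
        or (c["state"] == "Queued" and c["priority"] > 0)
    ]
    running = [c for c in containers if c["state"] == "Running"]

    # Earliest start time among running containers that report one.
    starts = [c["started_at"] for c in running if c["started_at"] is not None]
    # Assumption: creation time approximates when a container began waiting.
    created = [c["created_at"] for c in pending]

    return {
        "pending_count": len(pending),
        "running_count": len(running),
        "longest_running": now - min(starts) if starts else None,
        "oldest_pending_wait": now - min(created) if created else None,
    }
```

A container with state=Queued and priority=0 is deliberately excluded from the pending count and from the wait-time calculation, matching the first bullet.
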
Benefits:
  • Easy to implement [1] in Workbench in a way that works with all dispatch setups
  • Corresponds to reasonable user expectations ("it shouldn't take 2 hours to start a container")
Shortcomings:
  • "Lots of other users' containers are queued ahead of yours" looks identical to "nothing is running at all" (assuming user is not admin)
  • "Cluster is at capacity, with long-running containers" looks identical to "cluster is unable to run anything at all"
  • Doesn't take advantage of the metrics we are (or could be) tracking in arvados-dispatch-cloud, like recent queued-to-starting delays and the number of busy/idle/booting cloud instances.

[1] Assuming we aren't too picky about the definition of "oldest": currently we don't record how long a container has been ready to run, only when it was created (after which it may have spent a long time at priority=0) and when it was last modified (which may merely have been a priority increase long after it was ready to run).
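
To make the footnote concrete: if we assume a container became ready to run at some point between its creation and its last modification, the two recorded timestamps give only a lower and an upper bound on its real waiting time. The sketch below illustrates that reasoning; `wait_time_bounds` is a hypothetical helper, not part of Arvados.

```python
from datetime import datetime, timedelta, timezone


def wait_time_bounds(created_at: datetime, modified_at: datetime,
                     now: datetime) -> tuple[timedelta, timedelta]:
    """Bound how long a container has been ready to run.

    Assumption: the container became ready no earlier than created_at
    and no later than modified_at.
    """
    lower = now - modified_at  # ready only since the last modification
    upper = now - created_at   # ready ever since it was created
    return lower, upper


now = datetime(2019, 6, 14, 12, 0, tzinfo=timezone.utc)
lower, upper = wait_time_bounds(
    created_at=now - timedelta(hours=5),
    modified_at=now - timedelta(minutes=30),
    now=now,
)
print(lower, upper)
```

Reporting the upper bound (time since creation) overstates the wait for containers that sat at priority=0; reporting the lower bound understates it for containers whose priority was raised after they were already ready.
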


Subtasks 1 (0 open, 1 closed)

Task #15325: Review 15087-wb-queued-containers (Resolved, Peter Amstutz, 06/14/2019)

Related issues

Related to Arvados - Bug #15036: [Crunch2] Idle/busy node count not accurate if crunch1 not running (Duplicate)
Related to Arvados - Bug #15014: [Workbench] Hide busy/idle nodes display when crunch1 is not active (Resolved, Tom Clegg, 09/30/2019)
Blocks Arvados - Story #15133: Remove crunch v1 (jobs api) (Resolved, Peter Amstutz, 08/08/2019)

Actions #1

Updated by Tom Clegg over 3 years ago

  • Description updated (diff)
Actions #2

Updated by Tom Clegg over 3 years ago

  • Description updated (diff)
Actions #3

Updated by Tom Clegg over 3 years ago

  • Target version changed from To Be Groomed to Arvados Future Sprints
  • Story points set to 1.0
Actions #4

Updated by Peter Amstutz over 3 years ago

  • Related to Story #15133: Remove crunch v1 (jobs api) added
Actions #5

Updated by Tom Clegg about 3 years ago

  • Related to Bug #15036: [Crunch2] Idle/busy node count not accurate if crunch1 not running added
Actions #6

Updated by Tom Clegg about 3 years ago

  • Related to Bug #15014: [Workbench] Hide busy/idle nodes display when crunch1 is not active added
Actions #7

Updated by Tom Morris about 3 years ago

  • Related to deleted (Story #15133: Remove crunch v1 (jobs api))
Actions #8

Updated by Tom Morris about 3 years ago

Actions #9

Updated by Peter Amstutz about 3 years ago

  • Assigned To set to Peter Amstutz
  • Target version changed from Arvados Future Sprints to 2019-06-19 Sprint
Actions #10

Updated by Ward Vandewege about 3 years ago

  • Release set to 22
Actions #11

Updated by Tom Clegg about 3 years ago

  • Description updated (diff)
Actions #12

Updated by Peter Amstutz about 3 years ago

Do we actually want the age of the oldest queued container, or the age of the container at the top of the queue (which would be the highest-priority one)?

Actions #13

Updated by Peter Amstutz about 3 years ago

15087-wb-queued-containers @ 6b9eafb0de63da57e7b1a3945e7d16823e1c25df

Adds a new panel displaying the number of pending/running containers, the age of the oldest container, and the longest-running container.

Actions #14

Updated by Lucas Di Pentima about 3 years ago

Tried it locally by running some simple jobs on arvbox, seems to work ok! Just one comment:

Actions #16

Updated by Peter Amstutz about 3 years ago

15087-wb-queued-containers @ 466ae87ff214d37c0765ee64845941adcbae8af4

Added links to the oldest / longest running container. Tweaked the layout. Re-running tests:

https://ci.curoverse.com/view/Developer/job/developer-run-tests/1315/

Actions #17

Updated by Lucas Di Pentima about 3 years ago

Latest updates LGTM, thanks.

Actions #18

Updated by Peter Amstutz about 3 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100