Idea #21058

open

Let users know if Crunch hit cloud capacity

Added by Brett Smith 7 months ago. Updated 7 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Workbench2
Target version:
Start date:
Due date:
Story points: -

Description

If Crunch hits some cloud capacity limit, then any user who is starting or running a workflow may want to know about that—not just the user(s) whose workflows were directly affected.

If Crunch has hit one or more instance capacity limits in the last 8 hours (number bikesheddable), Workbench should display a red warning banner just under the top bar with a message like:

Arvados has been unable to create some instance types in the cloud recently:

  • t0.example (as of DATETIME)

Workflow execution may be delayed until resources are available again.

Repeat the list items for each affected instance type in the last 8 hours.

If Crunch hit a cloud-wide capacity problem, such as a network limit, the message can say:

Arvados has been unable to create new compute nodes as of DATETIME. Workflow execution may be delayed until resources are available again.
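As a rough illustration of the per-instance-type message, here is a minimal TypeScript sketch. The `InstanceCapacityError` record, its field names, and the `formatDate` callback are all assumptions made for this example; the ticket does not define how Crunch would expose this data, and `formatDate` stands in for whatever Workbench already uses to render absolute datetimes.

```typescript
// Hypothetical record shape; not an existing Arvados API.
interface InstanceCapacityError {
  instanceType: string; // e.g. "t0.example"
  lastSeen: Date;       // when Crunch last failed to create this type
}

// 8 hours is the ticket's placeholder value and is expected to change.
const WINDOW_MS = 8 * 60 * 60 * 1000;

// Returns the banner text as lines, or an empty array if nothing is in the window.
function instanceBannerLines(
  errors: InstanceCapacityError[],
  now: Date,
  formatDate: (d: Date) => string,
): string[] {
  // Keep only the most recent failure per instance type inside the window.
  const latest = new Map<string, Date>();
  for (const e of errors) {
    if (now.getTime() - e.lastSeen.getTime() > WINDOW_MS) continue;
    const prev = latest.get(e.instanceType);
    if (!prev || e.lastSeen.getTime() > prev.getTime()) {
      latest.set(e.instanceType, e.lastSeen);
    }
  }
  if (latest.size === 0) return [];

  // One list item per affected instance type, sorted for a stable display order.
  const items = Array.from(latest.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([type, seen]) => `  • ${type} (as of ${formatDate(seen)})`);

  return [
    "Arvados has been unable to create some instance types in the cloud recently:",
    "",
    ...items,
    "",
    "Workflow execution may be delayed until resources are available again.",
  ];
}
```

Passing `now` and `formatDate` in as arguments keeps the function pure, so the 8-hour cutoff can be checked without depending on the clock or on Workbench's date formatting.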

The banner can be dismissed. Once it is dismissed, it remains dismissed for the duration of the user's session, unless the error type changes (like a new instance type becomes unavailable, or we go from an instance problem to a cloud-wide problem). If that happens, the banner reappears with the new error information.
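One way to sketch the dismissal rule in TypeScript is to remember a signature of the error state the user dismissed and show the banner again whenever the current signature differs. The `CapacityStatus` shape and both function names are hypothetical, chosen only for this example.

```typescript
// Hypothetical summary of what Crunch reports; not an existing Arvados API.
interface CapacityStatus {
  instanceTypes: string[]; // instance types that could not be created recently
  cloudWide: boolean;      // true for a cloud-wide problem such as a network limit
}

// Build a key that changes when the set of affected types or the kind of problem changes.
// Order and duplicates in instanceTypes do not affect the result.
function errorSignature(status: CapacityStatus): string {
  const types = Array.from(new Set(status.instanceTypes)).sort();
  return JSON.stringify({ cloudWide: status.cloudWide, types });
}

// Show the banner if there is a problem and the user has not dismissed this exact state.
// dismissedSignature is null when nothing has been dismissed in this session.
function shouldShowBanner(
  status: CapacityStatus,
  dismissedSignature: string | null,
): boolean {
  const hasProblem = status.cloudWide || status.instanceTypes.length > 0;
  return hasProblem && errorSignature(status) !== dismissedSignature;
}
```

Under this sketch the banner also reappears when an instance type drops off the list, which the ticket does not explicitly ask for; comparing only for newly added types would be a stricter reading. In a browser, the dismissed signature could plausibly be kept in `sessionStorage` so that it lasts for the session, though that choice is also an assumption.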

DATETIME should be rendered the same way all other absolute datetimes are rendered in Workbench.


Related issues

Related to Arvados - Feature #21123: Add API that returns current dispatch/scheduling status for a specified container (Resolved, Tom Clegg, 03/15/2024)
#1

Updated by Brett Smith 7 months ago

There should probably be two tickets: one for Crunch to expose this information in a way Workbench can get at it, and another for the UI implementation. This can be the UI ticket. I'm not sure what the Crunch implementation should be, so I'll leave that for someone else to write.

#2

Updated by Peter Amstutz 6 months ago

  • Related to Feature #21123: Add API that returns current dispatch/scheduling status for a specified container added