Idea #21058
open
Let users know if Crunch hit cloud capacity
Description
If Crunch hits some cloud capacity limit, then any user who is starting or running a workflow may want to know about that, not just the user(s) whose workflows were directly affected.
If Crunch hit one or more instance capacity limits in the last 8 hours (number bikesheddable), Workbench should display a red warning banner just under the top bar with a message like:
Arvados has been unable to create some instance types in the cloud recently:
- t0.example (as of DATETIME)
Workflow execution may be delayed until resources are available again.
Repeat the list item for each instance type affected in the last 8 hours.
If Crunch hit a cloud-wide capacity problem, like a network limit, the message can say:
Arvados has been unable to create new compute nodes as of DATETIME. Workflow execution may be delayed until resources are available again.
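As a rough illustration of how the two messages above could be assembled, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CapacityEvent` record, the `banner_message` function, and the choices that a cloud-wide problem takes precedence over per-instance-type problems and that "as of" means the most recent failure for each type. None of this reflects an existing Crunch or Workbench API. The `fmt` parameter stands in for whatever datetime rendering Workbench already uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(hours=8)  # the "bikesheddable" window from the description


@dataclass(frozen=True)
class CapacityEvent:
    """Hypothetical record of one capacity failure reported by Crunch."""
    when: datetime
    instance_type: Optional[str] = None  # None means a cloud-wide problem


def banner_message(events, now, fmt=lambda d: d.isoformat()):
    """Return the banner text, or None if nothing happened within WINDOW."""
    recent = [e for e in events if timedelta(0) <= now - e.when <= WINDOW]
    if not recent:
        return None

    # Assumption: a cloud-wide problem replaces the per-type list.
    cloud_wide = [e for e in recent if e.instance_type is None]
    if cloud_wide:
        latest = max(e.when for e in cloud_wide)
        return (
            "Arvados has been unable to create new compute nodes "
            f"as of {fmt(latest)}. Workflow execution may be delayed "
            "until resources are available again."
        )

    # Keep only the most recent failure time for each instance type.
    latest_by_type = {}
    for e in recent:
        prev = latest_by_type.get(e.instance_type)
        if prev is None or e.when > prev:
            latest_by_type[e.instance_type] = e.when

    lines = ["Arvados has been unable to create some instance types "
             "in the cloud recently:"]
    lines += [f"- {t} (as of {fmt(w)})"
              for t, w in sorted(latest_by_type.items())]
    lines.append("Workflow execution may be delayed until resources "
                 "are available again.")
    return "\n".join(lines)
```

Events older than the window are simply ignored, so the banner disappears on its own once the last failure ages out.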
The banner can be dismissed. Once it is dismissed, it remains dismissed for the duration of the user's session, unless the error changes (for example, a new instance type becomes unavailable, or the problem goes from instance-specific to cloud-wide). If that happens, the banner reappears with the new error information.
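The dismissal rule could be modeled roughly as below. This is a sketch under assumptions, not a proposed implementation: `ErrorState` and `BannerSession` are hypothetical names, and the sketch interprets "the error type changes" as either a switch between instance-specific and cloud-wide, or the appearance of an instance type that was not listed when the user dismissed the banner. An instance type dropping off the list does not bring the banner back in this interpretation; the ticket does not say either way.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ErrorState:
    """Hypothetical summary of what the banner is currently reporting."""
    cloud_wide: bool = False
    instance_types: FrozenSet[str] = frozenset()

    @property
    def active(self) -> bool:
        return self.cloud_wide or bool(self.instance_types)


class BannerSession:
    """Tracks, for one user session, which error state was dismissed."""

    def __init__(self) -> None:
        self._dismissed: Optional[ErrorState] = None

    def dismiss(self, current: ErrorState) -> None:
        self._dismissed = current

    def should_show(self, current: ErrorState) -> bool:
        if not current.active:
            return False  # nothing to report
        dismissed = self._dismissed
        if dismissed is None:
            return True  # never dismissed in this session
        if current.cloud_wide != dismissed.cloud_wide:
            return True  # switched between instance-specific and cloud-wide
        # Reappear only if some instance type is new since the dismissal.
        return not current.instance_types <= dismissed.instance_types
```

For example, after dismissing a banner that lists only `t0.example`, `should_show` stays false while `t0.example` is the only affected type, and becomes true again once a second type is affected or the problem becomes cloud-wide.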
DATETIME should be rendered the same way all other absolute datetimes are rendered in Workbench.
Related issues
Updated by Brett Smith about 1 year ago
There should probably be two tickets: one for Crunch to expose this information in a way Workbench can get at it, and another for the UI implementation. This can be the UI ticket. I'm not sure what the Crunch implementation should be so I'll leave that for someone else to write.
Updated by Peter Amstutz about 1 year ago
- Related to Feature #21123: Add API that returns current dispatch/scheduling status for a specified container added