Feature #20383

open

Monitoring that gives list of compute containers that don't seem to be making progress

Added by Peter Amstutz about 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version:
Story points: -

Description

Want to get a real-time list of containers whose CPU usage and I/O usage are very low, indicating they aren't doing any work.
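
A minimal sketch of the kind of filter this could mean, assuming per-container usage samples are already being collected somewhere. The `ContainerSample` fields, the `stalled_containers` name, and both threshold defaults are hypothetical illustrations, not existing Arvados APIs or agreed values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ContainerSample:
    """Hypothetical usage summary for one container over a recent window."""
    uuid: str                 # container identifier
    cpu_fraction: float       # average CPU use over the window (1.0 = one full core)
    io_bytes_per_sec: float   # average combined read/write throughput


def stalled_containers(
    samples: Iterable[ContainerSample],
    cpu_threshold: float = 0.05,      # assumed cutoff, would need tuning
    io_threshold: float = 1024.0,     # assumed cutoff in bytes/sec, would need tuning
) -> List[str]:
    """Return UUIDs of containers whose CPU *and* I/O are both below the cutoffs."""
    return [
        s.uuid
        for s in samples
        if s.cpu_fraction < cpu_threshold and s.io_bytes_per_sec < io_threshold
    ]
```

Requiring both signals to be low keeps containers that are I/O-bound but light on CPU (or the reverse) off the list.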

Actions #1

Updated by Peter Amstutz about 1 year ago

  • Subject changed from Monitoring that gives list of "idle" compute nodes to Monitoring that gives list of compute containers that don't seem to be making progress
Actions #2

Updated by Peter Amstutz about 1 year ago

  • Description updated (diff)
Actions #3

Updated by Brett Smith about 1 year ago

What about CUDA jobs? If they're pegging the GPU but nothing else, is that reported? Can they be excluded from this list?
