Feature #22157
Updated by Peter Amstutz 3 months ago
It would be very helpful to record the high water marks for a container's CPU and RAM usage on the container record. There are two main uses I can think of:
* Detecting likely OOM conditions. Right now, arvados-cwl-runner parses crunch-run.log to get this information and reports it as a runtime status warning. This means that if arvados-cwl-runner itself goes OOM, the condition doesn't get flagged to the user.
* Detecting under-utilization. If CPU or RAM usage is below 50% for the whole run, the step is a candidate for a reduced resource request. However, before changing the workflow, it is important to know whether the step consistently under-utilizes its resources or whether its usage is highly variable. Having this in the database makes the information far more accessible than having to parse logs.
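
To illustrate how the recorded high water marks might be used, here is a rough sketch in Python. It is not an existing Arvados API: the `RunPeak` record, its field names, and the 95% "near limit" threshold are assumptions made up for this example. Only the 50% under-utilization cutoff comes from the description above. The same logic would apply to either CPU or RAM, given one (requested, peak) pair per past run of a step.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunPeak:
    """Hypothetical per-run record; field names are illustrative only."""
    requested: float  # amount requested (e.g. bytes of RAM or CPU cores)
    peak: float       # high water mark observed during the run

    @property
    def ratio(self) -> float:
        # Fraction of the request actually used at peak.
        return self.peak / self.requested if self.requested > 0 else 0.0


def classify(runs: Sequence[RunPeak], low: float = 0.5) -> str:
    """Classify a step's usage of one resource across several runs."""
    if not runs:
        return "no-data"
    below = [r.ratio < low for r in runs]
    if all(below):
        # Peak stayed under the cutoff in every run: candidate for a
        # smaller resource request.
        return "consistently-under-utilized"
    if any(below):
        # Some runs were low and some were not: usage is variable.
        return "variable"
    return "well-utilized"


def near_limit(run: RunPeak, threshold: float = 0.95) -> bool:
    """Flag a run whose peak came close to its request (possible OOM).

    The 0.95 threshold is an assumption, not a value from Arvados.
    """
    return run.ratio >= threshold


if __name__ == "__main__":
    history = [RunPeak(requested=8.0, peak=2.0), RunPeak(requested=8.0, peak=3.0)]
    print(classify(history))
    print(near_limit(RunPeak(requested=8.0, peak=7.9)))
```

Because the peak is an upper bound on usage over the whole run, a peak below the cutoff implies usage stayed below the cutoff throughout, which is why a single number per run is enough for this check.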