[Workbench] Update job stats to be more useful to the end user
As was highlighted during grooming of #10516, our job stats could be more useful to the user. This needs to be refined, but represents a first cut.
Here's an example of a current CWL job:
This pipeline started at 9:34 AM 12/7/2016. It failed after 57m at 10:31 AM 12/7/2016.
It ran for 6m (51m queued) and used 6m of node allocation time (1.0⨯ scaling).
created_at: 9:34 AM 12/7/2016
started_at: 9:34 AM 12/7/2016
finished_at: 10:31 AM 12/7/2016
Failed 6m / 6m (1.0⨯) Output of cwl-runner
This job started at 10:25 AM 12/7/2016. It failed after 6m at 10:31 AM 12/7/2016.
It ran for 1m (5m queued) and used 1m of node allocation time (1.0⨯ scaling).
Some things that I think we could improve:
- don't call the time the job was submitted a "started" time
- don't report the top-level CWL job separately from the overall workflow, since the user basically considers them both part of the system machinery
Stats that are of interest to the user:
- total wall clock time
- total core hours
- total cost
(for the three above, report both the actuals for this run and the totals that include stats from previously run, reused jobs)
- queuing time until the first job in the workflow started
- queuing and overhead time (reported separately?) during the execution of the workflow
- maximum width / parallelism?
- for individual jobs: keep cache hit rate & utilization, maximum CPU utilization, & other useful stats from crunchstat-summary
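
As a starting point for discussion, here is a rough sketch of how some of the workflow-level stats above could be derived from per-job timestamps. Everything in it is an assumption for illustration: `JobRecord`, its `cores` and `reused` fields, and `workflow_stats` are hypothetical names, not the existing Workbench or API models. Cost is left out because it needs a per-node price that is not defined here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRecord:
    # Hypothetical record; field names are illustrative only.
    created_at: datetime    # when the job was submitted
    started_at: datetime    # when it actually began running
    finished_at: datetime
    cores: int = 1          # assumed: cores allocated to the job
    reused: bool = False    # assumed: True if the result came from an earlier run


def core_hours(jobs):
    """Sum of cores * run time, in hours."""
    seconds = sum(
        j.cores * (j.finished_at - j.started_at).total_seconds() for j in jobs
    )
    return seconds / 3600


def workflow_stats(jobs):
    # "Actuals" only count jobs that really ran as part of this workflow run.
    actual = [j for j in jobs if not j.reused]
    if not actual:
        raise ValueError("no jobs were actually run")

    submitted = min(j.created_at for j in actual)
    first_start = min(j.started_at for j in actual)
    last_finish = max(j.finished_at for j in actual)

    # Maximum parallelism: sweep over start/finish events in time order.
    # At equal timestamps, -1 sorts before +1, so a job that ends exactly
    # when another starts does not count as overlapping.
    events = []
    for j in actual:
        events.append((j.started_at, 1))
        events.append((j.finished_at, -1))
    events.sort()
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)

    return {
        "wall_clock": last_finish - submitted,
        "initial_queue": first_start - submitted,
        "actual_core_hours": core_hours(actual),
        "total_core_hours": core_hours(jobs),  # includes reused jobs
        "max_parallelism": peak,
    }
```

With timestamps like the example above (submitted at 9:34, first job running at 10:25, finished at 10:31), this would report 57m of wall clock time with 51m of initial queuing, which is the split the current text makes hard to see.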