Expand arvados-controller to expose forecast features
Arvados-controller forecast endpoints should be created based on the work done in git.arvados.org:arvados-forecaster.git
Added /arvados/v1/container_requests/UUID/datapoints endpoint
This is the first commit to try to implement Historical Forecasting
This will make it possible to create Gantt diagrams that help visualize the
times and dependencies of a run.
Arvados-DCO-1.1-Signed-off-by: Nico Cesar <email@example.com>
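A minimal sketch of how a client might address the new endpoint, assuming the usual Arvados bearer-token authentication. The host name and token are placeholders, and the shape of the response is not described in this ticket, so the sketch only builds the request and does not send it.

```python
import urllib.request


def datapoints_request(api_host, cr_uuid, token):
    """Build (but do not send) a GET request for a container request's datapoints."""
    url = f"https://{api_host}/arvados/v1/container_requests/{cr_uuid}/datapoints"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder host and token; the UUID is the one used in this ticket's attachment names.
req = datapoints_request("zzzzz.example.com", "su92l-xvhdp-3gri0mi1vtakaf4", "xxxxx")
print(req.get_method(), req.full_url)
```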
#3 Updated by Nico César 8 months ago
- File su92l-xvhdp-3gri0mi1vtakaf4.png added
- File su92l-xvhdp-3gri0mi1vtakaf4_just_circles.dot added
- File su92l-xvhdp-3gri0mi1vtakaf4_intermediate_graph.pdf added
In order to determine dependencies, I added "<workflow>-start" and "<workflow>-end" nodes to the intermediate graph. You can see this in su92l-xvhdp-3gri0mi1vtakaf4_intermediate_graph.pdf.
Output.Source and Input.Source have been a source of trouble when connecting the dots to make dependencies work.
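The start/end bookkeeping could look roughly like the sketch below. The step names and the `deps` mapping (step name to the set of steps it depends on) are made up for illustration; the actual forecaster code may represent the graph differently.

```python
def add_start_end(workflow, deps):
    """Return a copy of deps with synthetic '<workflow>-start' and '<workflow>-end' nodes.

    deps maps each step to the set of steps it depends on (hypothetical layout).
    """
    start, end = f"{workflow}-start", f"{workflow}-end"
    graph = {step: set(parents) for step, parents in deps.items()}

    # Steps that at least one other step depends on.
    has_children = {p for parents in graph.values() for p in parents}

    # Steps with no dependencies hang off the synthetic start node.
    for parents in graph.values():
        if not parents:
            parents.add(start)

    # The synthetic end node depends on every step nothing else depends on.
    graph[end] = {step for step in deps if step not in has_children}
    graph[start] = set()
    return graph


example = {
    "bwa": set(),
    "haplotypecaller": {"bwa"},
    "merge": {"haplotypecaller"},
}
graph = add_start_end("wf", example)
```

With both synthetic nodes in place, every step lies on some path from start to end, which is what a Gantt-style layout needs.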
#4 Updated by Nico César 7 months ago
Longest running haplotypecaller: 1d21h49m:
https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-iqy6soi7sz4runm (Notice that the command doesn't have the -L)
Most of the others are in the 2h to 4h range, for example haplotypecaller_9:
As we discussed before, we can't use just the name of the step to do the bookkeeping of the metrics (running time in this case, but more to come).
Currently we use "duration:<checkpoint>#<containerUUID>" as the key, where checkpoint is, for example, "haplotypecaller", and anything after "#" is ignored when summarizing results.
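To make the problem concrete, here is a small sketch of summarizing by that key format. The container UUIDs and the 2h/4h values are invented; only the key layout and the 1d21h49m outlier come from this ticket.

```python
from collections import defaultdict


def checkpoint_of(key):
    """Split 'metric:checkpoint#containerUUID' and drop everything after '#'."""
    metric, _, rest = key.partition(":")
    checkpoint, _, _uuid = rest.partition("#")
    return metric, checkpoint


def summarize(datapoints):
    """Group values (seconds) by (metric, checkpoint) and compute simple statistics."""
    grouped = defaultdict(list)
    for key, seconds in datapoints.items():
        grouped[checkpoint_of(key)].append(seconds)
    return {
        k: {"count": len(v), "min": min(v), "max": max(v), "mean": sum(v) / len(v)}
        for k, v in grouped.items()
    }


# Hypothetical container UUIDs; 164940 s is the 1d21h49m run mentioned above.
datapoints = {
    "duration:haplotypecaller#zzzzz-dz642-000000000000001": 164940,
    "duration:haplotypecaller#zzzzz-dz642-000000000000002": 7200,
    "duration:haplotypecaller#zzzzz-dz642-000000000000003": 14400,
}
summary = summarize(datapoints)
```

Because the UUID is discarded, the 45-hour run lands in the same bucket as the 2h to 4h runs and skews the mean, which is why the step name alone is not a sufficient key.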
1. get the parent container request that has a workflow.json to compare to "the family of workflows" and use that as part of the key when storing the
2. make some kind of signature of the command used for that Container. This would take into account the command line, but could also get a little more creative: parse the inputs and add extra metrics, for example comparing input sizes.
3. a mix of 1 and 2, starting with 1 and trying to get a sense of how the clustering happens across all data
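One possible reading of the command-signature idea (the second item above), as a rough sketch only: hash the executable name together with the sorted set of option names, ignoring option values. The function name, the example commands, and the choice of SHA-256 are all assumptions for illustration, not something decided in this ticket.

```python
import hashlib


def command_signature(command):
    """Short signature built from the executable and the set of option names.

    Option values and positional arguments are ignored, so two runs that differ
    only in file names share a signature. Hypothetical helper, not existing code.
    """
    flags = sorted({arg.split("=", 1)[0] for arg in command[1:] if arg.startswith("-")})
    material = "\0".join([command[0]] + flags)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:12]


# Made-up command lines: one with -L (interval restriction), one without.
with_l = ["gatk", "HaplotypeCaller", "-R", "ref.fa", "-I", "a.bam", "-L", "chr1"]
without_l = ["gatk", "HaplotypeCaller", "-R", "ref.fa", "-I", "b.bam"]
other_sample = ["gatk", "HaplotypeCaller", "-R", "ref.fa", "-I", "c.bam", "-L", "chr2"]

sig_with_l = command_signature(with_l)
sig_without_l = command_signature(without_l)
sig_other = command_signature(other_sample)
```

Under this scheme the run without -L would be bucketed separately from the runs that have it, while runs that differ only in input file names would be grouped together. Input sizes would still need to be tracked as a separate metric.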
#8 Updated by Nico César 7 months ago
- Target version changed from 2020-06-17 Sprint to 2020-07-01 Sprint
After talking to Lucas and Tom this is the WIP: https://dev.arvados.org/projects/arvados/wiki/API_HistoricalForcasting_data_for_CR
(for later review)
#18 Updated by Nico César 4 months ago
Note to future self/reviewer: make sure GET https://<API>/arvados/v1/container_requests/ works as expected
#22 Updated by Nico César 3 months ago
16462-forecast-wip2 (f6ccc08c3f6b1ad42f2c827b19df0300f2c3c3db) can be reviewed now. This is unfinished work, but a review would help confirm whether I'm heading in the right direction.
I see that some of my changes broke federation:
#29 Updated by Nico César about 1 month ago
c0916b956054b2e16ff61bd33b11bfe07e81787d branch 16462-CR-datapoints