Idea #10111
closed[Workbench][Crunch2] Provenance graph for Container Request
Description
Extract dependency graph from Container Request and pass to existing code which knows how to use GraphViz to format and reuse the rest of the existing infrastructure.
Updated by Tom Morris over 8 years ago
- Subject changed from Provenance graph for Container Request to [Crunch2] Provenance graph for Container Request
Updated by Tom Morris about 8 years ago
- Story points set to 2.0
Extract dependency graph from Container Request and pass to existing code which knows how to use GraphViz to format and reuse the rest of the existing infrastructure.
Updated by Tom Clegg about 8 years ago
- Target version changed from 2016-11-23 sprint to Arvados Future Sprints
Updated by Tom Morris almost 8 years ago
- Target version changed from Arvados Future Sprints to 2017-04-12 sprint
Updated by Tom Morris almost 8 years ago
- Tracker changed from Bug to Idea
- Subject changed from [Crunch2] Provenance graph for Container Request to [Workbench][Crunch2] Provenance graph for Container Request
- Assigned To set to Lucas Di Pentima
Updated by Lucas Di Pentima over 7 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima over 7 years ago
- Target version changed from 2017-04-12 sprint to 2017-04-26 sprint
Updated by Lucas Di Pentima over 7 years ago
Updates at 93c92875aaebe5b06f8dbfe2822b59a772895c08 (branch 10111-cr-provenance-graph
)
Radhika: While I start working on the tests, I would like to check with you if this is the correct approach, and if there are missing elements on the graph to be included.
Updated by Lucas Di Pentima over 7 years ago
Merged master && added some tests: 4ccbea9ef
Test run: https://ci.curoverse.com/job/developer-run-tests/247/
Updated by Radhika Chippada over 7 years ago
For a CR -> Use Inputs from Mounts + output uuid + log uuid as the nodes
And then need to get all child CRs for this CR and repeat the above.
Updated by Lucas Di Pentima over 7 years ago
Updates at 30146198f
Test run: https://ci.curoverse.com/job/developer-run-tests/251/
- Removed container_image and requesting_container from the graph.
- Added child CRs with their own mounts/output/log.
Updated by Radhika Chippada over 7 years ago
The graph is looking pretty good. A few observations:
- As we discussed, I think we should only use input collections (plus output and log uuids) for the CR in “find_collections cr[:mounts]” and exclude any other collections, if any. I think you probably need to look for collection uuids / pdhs in the segment returned by application_helper.get_cwl_inputs?
- extra white space at line ends in _show_provenance.html.erb
Thanks.
Updated by Lucas Di Pentima over 7 years ago
Updates: 260e85a9d
- As we discussed, I think we should only use input collections (plus output and log uuids) for the CR in “find_collections cr[:mounts]” and exclude any other collections, if any. I think you probably need to look for collection uuids / pdhs in the segment returned by application_helper.get_cwl_inputs?
Ah yes, I thought that all mounts that specified a UUID/PDH were implicitly an input.
I have changed the code so it searches only for collections inside the /var/lib/cwl/cwl.input.json
mount, that as I understand by reading the get_cwl_inputs()
helper, it's the object describing the input mounts.
The result is that if a CR step is created and have mounts that aren’t from a CWL definition (example: bwa
command execution that uses FUSE), those mounts on the child CR won’t be included in the graph (ie: 9tee4’s /container_requests/9tee4-xvhdp-29wnyz1npk9bycs#Provenance
), is that ok or should I search for “any collection” inside mounts when not using arvados-cwl-runner
on the command?
- extra white space at line ends in _show_provenance.html.erb
Oops! done.
Another question: Currently I’m showing the Provenance tab on those CRs with state != Uncommitted, should I change that to only CR in Final state?
Updated by Radhika Chippada over 7 years ago
I have changed the code so it searches only for collections inside the /var/lib/cwl/cwl.input.json mount, that as I understand by reading the get_cwl_inputs() helper, it's the object describing the input mounts. The result is that if a CR step is created and have mounts that aren’t from a CWL definition (example: bwa command execution that uses FUSE), those mounts on the child CR won’t be included in the graph (ie: 9tee4’s /container_requests/9tee4-xvhdp-29wnyz1npk9bycs#Provenance), is that ok or should I search for “any collection” inside mounts when not using arvados-cwl-runner on the command?
Yes, this seems problematic. I think we should check for /keep/<pdh> format instead in mounts to get the input collections. Please confirm with Peter. Thanks.
Another question: Currently I’m showing the Provenance tab on those CRs with state != Uncommitted, should I change that to only CR in Final state?
Comparing with pipeline_instances and jobs, this seems correct (to show graph for Queued etc)
Updated by Lucas Di Pentima over 7 years ago
Update at: 88c241d7c
Test run will be on: https://ci.curoverse.com/job/developer-run-tests/257/
Search for all PDHs on "mounts" on cases when cwl.input.json
is not included. As talked with Radhika & Bryan, outputs aren't listed using PDHs, just paths. So there's no possibility of including an output as an input.
Updated by Lucas Di Pentima over 7 years ago
Update at: edfc619e6
As requested on the sprint review meeting, changed the graph edges from "cr" to "child" and "mounts" to "input".
Updated by Lucas Di Pentima over 7 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:b075d1be1377760f5d8497a29f63c8e416cd5378.
Updated by Peter Amstutz over 7 years ago
- Status changed from Resolved to Feedback
- Target version changed from 2017-04-26 sprint to 2017-05-10 sprint
Additional comments:
- Needs to use PDH so that inputs and output match up. For example, in this graph the output of "rev" is supposed to be an input of "sort": https://workbench.9tee4.arvadosapi.com/container_requests/9tee4-xvhdp-w382mn52hn18oad#Provenance
- Label collection inputs by name. If the collection shows up under multiple different names, prefer the name of the collection in the current project. Otherwise pick any name and render it something like "HWI-ST1027_129_D0THKACXX for CWL tutorial + 4 more"
- Don't render "log" outputs. They are just clutter.
- I'm not sure if its a good idea to render "child" links. If you have 300 child containers it is just a lot of lines providing very little information. Consider using a graphviz "subgraph" or "cluster".
- To determine the inputs of a container request, recursively search "mounts" for JSON fields that look like
"portable_data_hash": "abc+123"
and"location": "keep:abc+123"
Updated by Peter Amstutz over 7 years ago
In the interests of time, let's limit it to these changes:
- Needs to use PDH so that inputs and output match up. For example, in this graph the output of "rev" is supposed to be an input of "sort": https://workbench.9tee4.arvadosapi.com/container_requests/9tee4-xvhdp-w382mn52hn18oad#Provenance (this is the most important change, because the current behavior it is effectively a regression from the equivalent functionality for jobs)
- Don't render "log" outputs. They are just clutter.
- To determine the inputs of a container request, recursively search "mounts" for JSON fields that look like "portable_data_hash": "abc+123" and "location": "keep:abc+123" (this should ensure that nothing is missed)
Updated by Lucas Di Pentima over 7 years ago
Fixes at a4a8d41f6 - branch 10111-cr-prov-regression-fixes
Test run: https://ci.curoverse.com/job/developer-run-tests/273/
Updated by Lucas Di Pentima over 7 years ago
Branch 10111-collection-labels
- commit 39755f764
Test run: https://ci.curoverse.com/job/developer-run-tests/274/
Added better collection labelling on CR provenance graph.
Updated by Lucas Di Pentima over 7 years ago
An integration test was failing, updated fix at e01823785
New test run: https://ci.curoverse.com/job/developer-run-tests/275/
Updated by Peter Amstutz over 7 years ago
Additional comment: where we have a container request with an explicit output_uuid, make sure to use the label corresponding to the name of the collection in output_uuid, before falling back on the logic outlined in note-24
Updated by Lucas Di Pentima over 7 years ago
Updates at 4259263d2
Test run: https://ci.curoverse.com/job/developer-run-tests/276/
Addressed issue about naming output collections after the cr's output_uuid collection reference.
Updated by Peter Amstutz over 7 years ago
How hard would it be to fix the hyperlinks so that when you have a specific UUID associated with a collection, clicking on it takes you directly to it and not to the "this PDH has multiple collections" page?
Updated by Peter Amstutz over 7 years ago
Another note. For labeling, if there are multiple collections but they have the same name, you don't need the "+N more"
It's making a separate API call for every collection. That adds a lot of latency. It should find all the PDHs in the graph, make a batch request for them all, and then filter on the workbench side.
Updated by Lucas Di Pentima over 7 years ago
Updates at b29ca38e4
Test run: https://ci.curoverse.com/job/developer-run-tests/278/
- Refactored the graph creation code for CR so that it minimizes the amount of API calls when looking for information about outputs, inputs and childs.
- For input collections, when there are more than one with the same name, don't add the "+N more" to the name label.
- For output collections, added an option on
describe_node()
helper function so that the graph node is referenced by PDH, but link urls are rendered by UUID so they take the user to the specific collection page when clicking on it.
Updated by Lucas Di Pentima over 7 years ago
- Target version changed from 2017-05-10 sprint to 2017-05-24 sprint
Updated by Peter Amstutz over 7 years ago
Ok, for large workflows, it still takes forever to load, but it seems that "dot" is the bottleneck now. We need to rethink representation, but not for this story (I'm putting that on a new ticket, #11680).
On the implementation:
The intended way to call GenerateGraph() was with pdata
to contain all the nodes that will be used in the graph. In order to have better separation of concerns, would it make sense for the new code in generate_provenance_edges()
that does the batch queries to move to container_requests_controller#generate_provenance
?
Updated by Lucas Di Pentima over 7 years ago
Updates at 795bf007c
Test run: https://ci.curoverse.com/job/developer-run-tests/284/
Moved the code related to API requests to the CR controller.
Updated by Peter Amstutz over 7 years ago
Thanks. This is a much better separation of concerns.
I'm unhappy with how it behaves with large graphs, but instead of continuing to go around back and forth I think we should merge 795bf007cbe24775bd348fb40fc5c28d93c8f23d and schedule a grooming session to figure out how rendering can be improved.
LGTM.
Updated by Lucas Di Pentima over 7 years ago
- Status changed from Feedback to Resolved
- % Done changed from 33 to 100
Applied in changeset arvados|commit:f5fbc48810d1397df9e6244c16cf07c05162d36a.