tiling workflow cancelled for unknown reason
I am running a tiling workflow, but it gets cancelled: https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-mzrysxcgtubgva9
I tried various runtime constraints and workflow parameters, but every run gets cancelled.
Before su92l was upgraded, I ran a workflow of the same scale (input also around 2TB), and it was successful. https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-nm507pzmjqiai4s
Comparing individual jobs from these two runs: https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-vdlq5f0hqldttso completed, but https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-t3dtsqsi3vqfetb was cancelled.
#1 Updated by Jiayong Li about 1 month ago
I changed "no_listing" from "hints" to "requirements", but the run still failed: https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-jqx484v754z4vzl
#2 Updated by Lucas Di Pentima about 1 month ago
- Target version changed from To Be Groomed to 2020-02-26 Sprint
- Assigned To set to Lucas Di Pentima
- Status changed from New to In Progress
- Category set to Crunch
It seems that the container is getting OOM-killed.
We're also getting a warning in the log:
Warning: cwltool: ../../lib/cwl/workflow.json:1:25668: Recursive directory listing has resulted in a large number of File objects (1733821) passed to the input parameter 'fjdir'. This may negatively affect workflow performance and memory use.

If this is a problem, use the hint 'cwltool:LoadListingRequirement' with "shallow_listing" or "no_listing" to change the directory listing behavior:

    $namespaces:
      cwltool: "http://commonwl.org/cwltool#"
    hints:
      cwltool:LoadListingRequirement:
        loadListing: shallow_listing
...but the workflow already has the no_listing hint from previous (pre 2.0) successful runs. Maybe this hint is being ignored?
#3 Updated by Jiayong Li about 1 month ago
Specifying "no_listing" on the workflow gets ignored, but specifying "no_listing" at the job level works.
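For anyone hitting the same problem, here is a rough sketch of what the job-level placement might look like. It assumes "job level" means the individual CommandLineTool run by a workflow step, which the thread does not confirm. The hint text is taken from the cwltool warning quoted above, with "no_listing" substituted for "shallow_listing"; the input name 'fjdir' also comes from that warning. The baseCommand and the empty outputs are placeholders, not part of the actual tiling workflow.

```yaml
# Hypothetical tool definition, for illustration only.
cwlVersion: v1.0
class: CommandLineTool
$namespaces:
  cwltool: "http://commonwl.org/cwltool#"
hints:
  # Placed on the tool itself rather than on the enclosing workflow.
  cwltool:LoadListingRequirement:
    loadListing: no_listing
baseCommand: [ls]        # placeholder command (assumption)
inputs:
  fjdir:                 # input name taken from the warning above
    type: Directory
    inputBinding:
      position: 1
outputs: []              # placeholder (assumption)
```

With the hint on the tool, the directory passed as 'fjdir' should not be expanded into individual File objects for that job, regardless of whether the workflow-level hint is honored.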