Bug #18311

[cwl] test 221 in the 1.2 conformance suite is failing on singularity

Added by Ward Vandewege about 1 month ago. Updated 12 days ago.

Status:
New
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:
Story points:
-

Description

On our singularity clusters (9tee4 and tordo, currently), the failure is (from https://ci.arvados.org/view/Arvados%20build%20pipeline/job/run-tests-cwl-suite/37/console):

21:57:52 Test 221 failed: /usr/bin/arvados-cwl-runner --compute-checksum --disable-reuse --eval-timeout 60 --outdir=/tmp/tmpxpfu_s_d --quiet tests/timelimit2-wf.cwl tests/empty.json
21:57:52 Test that workflow level time limit is not applied to workflow execution time
21:57:52 Returned non-zero
21:57:52 ERROR [container timelimit2-wf.cwl] (9tee4-dz642-j9qil6yio4ms1ku) error log:
21:57:52 
21:57:52   2021-10-29T01:56:31.148851065Z crunch-run Loading Docker image from keep
21:57:52   2021-10-29T01:56:31.641354946Z crunch-run Starting container
21:57:52   2021-10-29T01:56:31.642351277Z crunch-run Waiting for container to finish
21:57:52   2021-10-29T01:56:37.397845753Z stderr INFO /usr/bin/arvados-cwl-runner 2.3.0.dev20211020182823, arvados-python-client 2.3.0.dev20211013202728, cwltool 3.1.20211020155521
21:57:52   2021-10-29T01:56:37.414865353Z stderr INFO Resolved '/var/lib/cwl/workflow.json#main' to 'file:///var/lib/cwl/workflow.json#main'
21:57:52   2021-10-29T01:56:45.046750708Z stderr INFO Using cluster 9tee4 (https://workbench2.9tee4.arvadosapi.com/)
21:57:52   2021-10-29T01:56:45.558323744Z stderr INFO Upload local files: "workflow.json" 
21:57:52   2021-10-29T01:56:45.758393815Z stderr INFO Uploaded to f9d6a1cda3285e35b4b70234c4712b95+61 (9tee4-4zz18-wn85zp8etf12ora)
21:57:52   2021-10-29T01:56:45.763297415Z stderr INFO Upload local files: "workflow.json" 
21:57:52   2021-10-29T01:56:45.887338055Z stderr INFO Uploaded to f9d6a1cda3285e35b4b70234c4712b95+61 (9tee4-4zz18-35bm0g8ul56y4no)
21:57:52   2021-10-29T01:56:50.325226086Z stderr INFO Using collection cache size 256 MiB
21:57:52   2021-10-29T01:56:50.574797297Z stderr INFO Running inside container 9tee4-dz642-j9qil6yio4ms1ku
21:57:52   2021-10-29T01:56:50.596940754Z stderr INFO [workflow workflow.json#main] start
21:57:52   2021-10-29T01:56:50.597164015Z stderr INFO [workflow workflow.json#main] starting step step1
21:57:52   2021-10-29T01:56:50.597407019Z stderr INFO [step step1] start
21:57:52   2021-10-29T01:56:50.759170131Z stderr INFO [container step1] 9tee4-xvhdp-njas6p46cvav01s state is Committed
21:57:52   2021-10-29T01:57:38.633382081Z stderr INFO [container step1] 9tee4-xvhdp-njas6p46cvav01s is Final
21:57:52   2021-10-29T01:57:38.720112167Z stderr ERROR [container step1] (9tee4-xvhdp-njas6p46cvav01s) error log:
21:57:52   2021-10-29T01:57:38.720112167Z stderr
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:20.484569438Z crunch-run Not starting a gateway server (GatewayAuthSecret was not provided by dispatcher)
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:20.484755292Z crunch-run crunch-run 2.3.0 (go1.17.1) started
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:20.484776349Z crunch-run Executing container '9tee4-dz642-szbmvhrdakmb0ha' using singularity runtime
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:20.484839635Z crunch-run Executing on host 'compute1.9tee4.arvadosapi.com'
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:20.604062194Z crunch-run container token "v2/9tee4-gj3su-eaaf5ru6ohnw7d2/36sq5h3y5f0qo1s7qlgzrn9hfn4h851ufstfkpjku5sh76i92j/9tee4-dz642-szbmvhrdakmb0ha" 
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:20.604359220Z crunch-run Running [arv-mount --foreground --read-write --storage-classes default --crunchstat-interval=10 --file-cache 268435456 --mount-by-pdh by_id --mount-by-id by_uuid /tmp/crunch-run.9tee4-dz642-szbmvhrdakmb0ha.3491659512/keep4170174168]
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:21.510019953Z crunch-run Fetching Docker image from collection 'd2a6f06e1f8e3e7d72b8ee89622e9f96+261'
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:21.576950500Z crunch-run Using Docker image id "sha256:61064933b465210ec06517e95e64b0909841d4ccf037552266d7079baed45a6e" 
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:21.576984357Z crunch-run Loading Docker image from keep
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:22.013919659Z crunch-run Starting container
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:22.014675984Z crunch-run Waiting for container to finish
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:27.015064319Z crunch-run maximum run time exceeded. Stopping container.
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:27.015148637Z crunch-run stopping container
21:57:52   2021-10-29T01:57:38.720112167Z stderr   2021-10-29T01:57:27.123407016Z crunch-run Cancelled
21:57:52   2021-10-29T01:57:38.817582951Z stderr ERROR [step step1] Output is missing expected field file:///var/lib/cwl/workflow.json#main/step1/o
21:57:52   2021-10-29T01:57:38.900999492Z stderr WARNING [step step1] completed permanentFail
21:57:52   2021-10-29T01:57:38.976870679Z stderr INFO [workflow workflow.json#main] completed permanentFail
21:57:52   2021-10-29T01:57:38.976917303Z stderr ERROR Overall process status is permanentFail
21:57:52   2021-10-29T01:57:39.142930716Z stderr INFO Final output collection 42866e7c2f47b2fd38dca68903e59c4b+59 "Output of main (2021-10-29T01:57:39.107Z)" (9tee4-4zz18-yr16og3bk6qwk3q)
21:57:52   2021-10-29T01:57:39.374321052Z stderr WARNING Final process status is permanentFail
21:57:52   2021-10-29T01:57:39.956477426Z crunch-run Complete
21:57:52 ERROR Overall process status is permanentFail
21:57:52 WARNING Final process status is permanentFail

And on 9tee4 (from https://ci.arvados.org/view/Arvados%20build%20pipeline/job/run-tests-cwl-suite/40/console):

11:24:02 Test 221 failed: /usr/bin/arvados-cwl-runner --compute-checksum --disable-reuse --eval-timeout 60 --outdir=/tmp/tmp_lnu_8e_ --quiet tests/timelimit2-wf.cwl tests/empty.json
11:24:02 Test that workflow level time limit is not applied to workflow execution time
11:24:02 Returned non-zero
11:24:02 ERROR [container timelimit2-wf.cwl] (9tee4-dz642-tggdcuakup1jfq5) error log:
11:24:02 
11:24:02   2021-10-29T15:22:53.175747990Z crunch-run Loading Docker image from keep
11:24:02   2021-10-29T15:22:53.849775581Z crunch-run Starting container
11:24:02   2021-10-29T15:22:53.850902104Z crunch-run Waiting for container to finish
11:24:02   2021-10-29T15:23:05.619135487Z stderr INFO /usr/bin/arvados-cwl-runner 2.3.0.dev20211020182823, arvados-python-client 2.3.0.dev20211013202728, cwltool 3.1.20211020155521
11:24:02   2021-10-29T15:23:05.636148789Z stderr INFO Resolved '/var/lib/cwl/workflow.json#main' to 'file:///var/lib/cwl/workflow.json#main'
11:24:02   2021-10-29T15:23:14.022887185Z stderr INFO Using cluster 9tee4 (https://workbench2.9tee4.arvadosapi.com/)
11:24:02   2021-10-29T15:23:14.279142453Z stderr INFO Upload local files: "workflow.json" 
11:24:02   2021-10-29T15:23:14.428674656Z stderr INFO Uploaded to f9d6a1cda3285e35b4b70234c4712b95+61 (9tee4-4zz18-9zab9j0xsgvigar)
11:24:02   2021-10-29T15:23:14.435330631Z stderr INFO Upload local files: "workflow.json" 
11:24:02   2021-10-29T15:23:14.571627543Z stderr INFO Uploaded to f9d6a1cda3285e35b4b70234c4712b95+61 (9tee4-4zz18-gjzewquov369heg)
11:24:02   2021-10-29T15:23:19.004648013Z stderr INFO Using collection cache size 256 MiB
11:24:02   2021-10-29T15:23:19.030485917Z stderr INFO Running inside container 9tee4-dz642-tggdcuakup1jfq5
11:24:02   2021-10-29T15:23:19.052813153Z stderr INFO [workflow workflow.json#main] start
11:24:02   2021-10-29T15:23:19.053081383Z stderr INFO [workflow workflow.json#main] starting step step1
11:24:02   2021-10-29T15:23:19.053338000Z stderr INFO [step step1] start
11:24:02   2021-10-29T15:23:19.230789355Z stderr INFO [container step1] 9tee4-xvhdp-qojf0j4ca8lyvnf state is Committed
11:24:02   2021-10-29T15:23:55.085180501Z stderr INFO [container step1] 9tee4-xvhdp-qojf0j4ca8lyvnf is Final
11:24:02   2021-10-29T15:23:55.176969339Z stderr ERROR [container step1] (9tee4-xvhdp-qojf0j4ca8lyvnf) error log:
11:24:02   2021-10-29T15:23:55.176969339Z stderr
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:45.345522001Z crunch-run Not starting a gateway server (GatewayAuthSecret was not provided by dispatcher)
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:45.345759158Z crunch-run crunch-run 2.3.0 (go1.17.1) started
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:45.345779799Z crunch-run Executing container '9tee4-dz642-e1vpz4lmmpzjdr4' using singularity runtime
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:45.345801566Z crunch-run Executing on host 'compute1.9tee4.arvadosapi.com'
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:45.459713326Z crunch-run container token "v2/9tee4-gj3su-ewn10p08oeckpxj/1eqyw1uioulmxl78olk0i743ipt4nro0neksh32ee48ks8rjux/9tee4-dz642-e1vpz4lmmpzjdr4" 
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:45.460034932Z crunch-run Running [arv-mount --foreground --read-write --storage-classes default --crunchstat-interval=10 --file-cache 268435456 --mount-by-pdh by_id --mount-by-id by_uuid /tmp/crunch-run.9tee4-dz642-e1vpz4lmmpzjdr4.734327895/keep1570108756]
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:46.365999816Z crunch-run Fetching Docker image from collection 'd2a6f06e1f8e3e7d72b8ee89622e9f96+261'
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:46.430319132Z crunch-run Using Docker image id "sha256:61064933b465210ec06517e95e64b0909841d4ccf037552266d7079baed45a6e" 
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:46.430387977Z crunch-run Loading Docker image from keep
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:47.011343535Z crunch-run Starting container
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:47.012153992Z crunch-run Waiting for container to finish
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:52.012532501Z crunch-run maximum run time exceeded. Stopping container.
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:52.012641954Z crunch-run stopping container
11:24:02   2021-10-29T15:23:55.176969339Z stderr   2021-10-29T15:23:52.123251411Z crunch-run Cancelled
11:24:02   2021-10-29T15:23:55.259792247Z stderr ERROR [step step1] Output is missing expected field file:///var/lib/cwl/workflow.json#main/step1/o
11:24:02   2021-10-29T15:23:55.344468592Z stderr WARNING [step step1] completed permanentFail
11:24:02   2021-10-29T15:23:55.423478553Z stderr INFO [workflow workflow.json#main] completed permanentFail
11:24:02   2021-10-29T15:23:55.423519944Z stderr ERROR Overall process status is permanentFail
11:24:02   2021-10-29T15:23:55.572398801Z stderr INFO Final output collection 42866e7c2f47b2fd38dca68903e59c4b+59 "Output of main (2021-10-29T15:23:55.541Z)" (9tee4-4zz18-smx5y5kzo0oyh11)
11:24:02   2021-10-29T15:23:55.804266214Z stderr WARNING Final process status is permanentFail
11:24:02   2021-10-29T15:23:56.277148741Z crunch-run Complete
11:24:02 ERROR Overall process status is permanentFail
11:24:02 WARNING Final process status is permanentFail

Compare with ce8i5 where it passes (https://ci.arvados.org/view/Arvados%20build%20pipeline/job/run-tests-cwl-suite/38).

History

#1 Updated by Ward Vandewege about 1 month ago

  • Description updated (diff)

#2 Updated by Peter Amstutz 26 days ago

  • Target version changed from 2021-11-10 sprint to 2021-11-24 sprint

#3 Updated by Peter Amstutz 12 days ago

  • Target version changed from 2021-11-24 sprint to 2021-12-08 sprint

Also available in: Atom PDF