Bug #18264

[CI] simplify the way we run the CWL tests

Added by Ward Vandewege 9 days ago. Updated about 4 hours ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time: (Total: 0.00 h)
Story points: -

Description

We currently run the CWL tests on our test clusters by launching a custom script on the main Jenkins server, which copies a second custom script to the shell node of the cluster and runs it there. This is convoluted and error-prone. Make some changes:

  • instead of running with -j1, increase the parallelism to whatever the target cluster can handle (the bottleneck is the machine that runs arvados-cwl-runner, a-c-r!)
  • instead of relying on a shell node, just start a Jenkins satellite with appropriate Arvados credentials for the target cluster and run the test suite that way
  • instead of having one CI job for both the upstream CWL test suite and our Arvados CWL tests, split them into 2 jobs and run them in parallel (if the target cluster can handle that)
  • if possible, instead of one CI job (or a pair of CI jobs) for each cluster, make a parameterized job that is launched with the appropriate parameters in the build pipeline for each cluster
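
As a rough illustration of the "two suites in parallel, with a configurable -j" idea above, here is a minimal Python sketch. It is not the actual CI script: the suite names, the run_suite stub and the parallelism value of 4 are assumptions made up for the example, and a real job would launch the test runner where the stub just returns a string.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical suite labels; the ticket only names "the upstream CWL test
# suite" and "our Arvados CWL tests".
SUITES = ("cwl-suite", "arvados-cwl")


def run_suite(suite, cluster_id, parallelism):
    """Placeholder for launching one test suite against one cluster.

    A real job would start the test runner here; this stub only reports
    what it would have run, so the sketch stays self-contained.
    """
    return f"{suite} on {cluster_id} with -j{parallelism}"


def run_all(cluster_id, parallelism):
    """Run every suite for one cluster concurrently; results keep SUITES order."""
    with ThreadPoolExecutor(max_workers=len(SUITES)) as pool:
        futures = [pool.submit(run_suite, s, cluster_id, parallelism) for s in SUITES]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # The parallelism value is an assumption; it should be whatever the
    # target cluster can handle.
    for line in run_all("9tee4", parallelism=4):
        print(line)
```

Keeping the cluster and the parallelism as arguments rather than constants is what would let a single parameterized job serve every cluster.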

Subtasks

Task #18271: review (New, Peter Amstutz)


Related issues

Blocked by Arvados - Bug #18238: CWL integration test failing (Resolved)

History

#1 Updated by Ward Vandewege 9 days ago

  • Status changed from New to In Progress

#2 Updated by Ward Vandewege 9 days ago

  • Description updated (diff)

#3 Updated by Ward Vandewege 9 days ago

Ready for review at commit:e4376aca8fd1e81a03b8534cab6cbd07220c45b9 on branch 18264-cwl-testing in the arvados-dev repo

I've set up an example of the corresponding CI changes that I'm planning to make, at

https://ci.arvados.org/view/Developer/job/developer-diagnostics-9tee4/

That job has 2 downstream projects:

developer-run-tests-arvados-cwl
developer-run-tests-cwl-suite

which are invoked with the appropriate parameters (cluster_id 9tee4).

So, in the build pipeline, I'm planning to:

  • decommission run-cwl-test-9tee4
  • replace it with copies of developer-run-tests-arvados-cwl and developer-run-tests-cwl-suite, which will be invoked with $cluster_id set to 9tee4 and run in parallel
  • rinse and repeat for ce8i5 and tordo
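
To make the plan above concrete, the per-cluster job pairs could be enumerated along these lines. This is only a sketch: the job names are the developer examples mentioned above (the pipeline copies may be named differently), and build_invocations is a hypothetical helper, not part of Jenkins or the arvados-dev repo.

```python
from itertools import product

CLUSTERS = ("9tee4", "ce8i5", "tordo")
# Job names taken from the developer example; the pipeline copies may differ.
JOBS = ("developer-run-tests-arvados-cwl", "developer-run-tests-cwl-suite")


def build_invocations(clusters=CLUSTERS, jobs=JOBS):
    """Return one (job name, parameters) pair per cluster and job."""
    return [(job, {"cluster_id": cluster}) for cluster, job in product(clusters, jobs)]


if __name__ == "__main__":
    for job, params in build_invocations():
        print(job, params)
```

With three clusters and two jobs this yields six invocations, all driven by the same two job definitions.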

At some point we can do the equivalent cleanup for the deploy-to-XXXXX and diagnostics-XXXXX CI jobs, and consolidate those into one job with a parameter.

#4 Updated by Peter Amstutz 8 days ago

  • Target version changed from 2021-10-13 sprint to 2021-10-27 sprint

#5 Updated by Ward Vandewege 8 days ago

A few more things:

  • we should be running the 1.2 version of the conformance tests (done as of commit:7e0e0601f5f20003db4e8955503edfc8e003dd8f on branch 18264-cwl-testing in the arvados-dev repo)
  • there are test failures in the 1.2 version of the conformance tests, due to a bug in a-c-r, cf. https://dev.arvados.org/issues/18238#note-6, waiting for a fix there
  • sort out the use of git_hash among all the jobs (sometimes it's the arvados repo hash - correct - sometimes it is the arvados-dev repo)
  • make sure the CI job installs (on the satellite node) the exact version of the packages that corresponds to git_hash

#6 Updated by Ward Vandewege 8 days ago

  • Blocked by Bug #18238: CWL integration test failing added
