Story #17755

Test singularity support on a cloud cluster by running some real workflows

Added by Peter Amstutz 11 days ago. Updated 4 days ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Start date:
Due date:
% Done: 0%
Estimated time: (Total: 0.00 h)
Story points: -

Description

On a cloud cluster with singularity enabled (requires a new compute image plus config changes), run the WGS workflow with small chr19 inputs, e.g. pirca-xvhdp-m4r8n9qmoh5ggra


Subtasks

Task #17796: Review (New)


Related issues

Related to Arvados - Story #17296: Singularity proof of concept (Resolved, 05/27/2021)

History

#1 Updated by Peter Amstutz 11 days ago

  • Status changed from New to In Progress

#2 Updated by Peter Amstutz 11 days ago

  • Status changed from In Progress to New
  • Tracker changed from Bug to Story

#3 Updated by Peter Amstutz 11 days ago

  • Category set to Crunch

#4 Updated by Peter Amstutz 11 days ago

  • Description updated (diff)

#5 Updated by Peter Amstutz 7 days ago

  • Description updated (diff)
  • Subject changed from Test singularity support on 9tee4 by running some real workflows to Test singularity support on a cloud cluster by running some real workflows

#6 Updated by Peter Amstutz 6 days ago

  • Assigned To set to Ward Vandewege

#7 Updated by Ward Vandewege 4 days ago

  • Status changed from New to In Progress

#8 Updated by Ward Vandewege 4 days ago

  • Related to Story #17296: Singularity proof of concept added

#9 Updated by Ward Vandewege 4 days ago

I have built a new compute node image for ce8i5 that includes the singularity 3.5.2 binary (that's an old version), cf. commit 784a3f24d37819186a52ea2c67e15e5bd8639076 on branch 17755-add-singularity-to-compute-image. That commit builds singularity from source while creating the packer image, which seems suboptimal: it requires a lot of build-time dependencies, and it takes a while.

I've run the diagnostics workflow with it at ce8i5-xvhdp-el23s4fjsrp1mjb.

Observations:
  • the conversion of a ~240MiB docker tar file took over 5 minutes (ouch!!), about 4 of which seem to be spent in the invocation of `mksquashfs`
  • this conversion is invisible to the user: it's part of the "loading the image" stage, so it's not counted when (e.g.) workbench shows workflow step durations.
  • the `hasher3` step ran on the same node as the `hasher1` step, but it repeated the singularity import step. Ouch.
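
One way to avoid repeating the import on a node that has already done it would be a per-node cache of converted images, keyed by the content hash of the docker tar file. The following is a minimal sketch of that idea, not what crunch-run currently does. The names `cached_image`, `singularity_build`, and the cache directory layout are hypothetical, and the `singularity build ... docker-archive://...` invocation is assumed to match the installed singularity version.

```python
import hashlib
import os
import subprocess
from pathlib import Path
from typing import Callable


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def singularity_build(docker_tar: Path, sif_path: Path) -> None:
    """Convert a docker tar file to a singularity image.

    Assumption: the installed singularity accepts a docker-archive:// source.
    """
    subprocess.run(
        ["singularity", "build", str(sif_path), f"docker-archive://{docker_tar}"],
        check=True,
    )


def cached_image(
    docker_tar: Path,
    cache_dir: Path,
    convert: Callable[[Path, Path], None] = singularity_build,
) -> Path:
    """Return a converted image, running the conversion only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{file_sha256(docker_tar)}.sif"
    if target.exists():
        return target  # cache hit: skip the slow conversion entirely

    # Build under a temporary name, then rename, so that a half-written
    # image is never visible under the final cache key.
    tmp = cache_dir / f"{target.name}.{os.getpid()}.partial"
    try:
        convert(docker_tar, tmp)
        os.replace(tmp, target)
    finally:
        if tmp.exists():
            tmp.unlink()
    return target
```

With a cache like this, a second container using the same image on the same node (as with `hasher1` and `hasher3` above) would only pay for hashing the tar file, not for `mksquashfs`. It does not help the first container on a fresh node.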

ce8i5-xvhdp-ia0zn3njn6lwbdj: docker: 0m9s (6m queued)
ce8i5-xvhdp-el23s4fjsrp1mjb: singularity: 0m7s (11m queued)
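
To compare the two runs, the timing lines above can be parsed and the queue times subtracted. This is only a sketch: the `parse_timing` helper and its regular expression are made up for this comment, and whether the "queued" figure actually includes the image conversion (rather than, say, node boot time) is an assumption that these two data points cannot confirm.

```python
import re

# Matches lines like "<uuid>: <runtime>: 0m9s(6m queued)",
# with or without a space before the parenthesis.
TIMING_RE = re.compile(
    r"^(?P<uuid>\S+):\s*(?P<runtime>\w+):\s*"
    r"(?P<min>\d+)m(?P<sec>\d+)s\s*\((?P<queued>\d+)m queued\)$"
)


def parse_timing(line: str) -> dict:
    """Parse one timing line into run and queue durations in seconds."""
    m = TIMING_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized timing line: {line!r}")
    return {
        "uuid": m["uuid"],
        "runtime": m["runtime"],
        "run_seconds": int(m["min"]) * 60 + int(m["sec"]),
        "queued_seconds": int(m["queued"]) * 60,
    }


docker = parse_timing("ce8i5-xvhdp-ia0zn3njn6lwbdj: docker: 0m9s(6m queued)")
sing = parse_timing("ce8i5-xvhdp-el23s4fjsrp1mjb: singularity: 0m7s(11m queued)")

# Difference in queue time between the singularity and docker runs.
extra_queue_seconds = sing["queued_seconds"] - docker["queued_seconds"]
print(f"extra queue time with singularity: {extra_queue_seconds // 60}m")
```

The reported run times are nearly identical (9s vs. 7s), while the queue times differ by 5 minutes, which is in the same range as the conversion time noted in the observations.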
