Bug #19962

singularity crunch-runner jobs cannot deal with a very large number of bind mounts

Added by Tom Schoonjans over 1 year ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version:
Story points: -
Release:
Release relationship: Auto

Description

We recently increased the number of inputs to one of our workflow steps and ran into a problem once the number of collection PDHs referenced in the keep URIs passed a certain threshold (we are not sure of the exact value, but it must be around 700-800). The job then fails with the following error:

2023-01-18T21:14:11.798930427Z crunch-run using local keepstore process (pid 2941334) at http://localhost:35053
2023-01-18T21:14:11.799136738Z crunch-run Not starting a gateway server (GatewayAuthSecret was not provided by dispatcher)
2023-01-18T21:14:11.799220190Z crunch-run crunch-run 2.4.1 (go1.17.1) started
2023-01-18T21:14:11.800955252Z crunch-run crunch-run process has uid=5678(crunch) gid=5678(crunch) groups=5678(crunch),119(docker)
2023-01-18T21:14:12.432869677Z crunch-run Using FUSE mount: /usr/bin/arv-mount 2.4.2
2023-01-18T21:14:12.462541344Z crunch-run Using container runtime: singularity-ce version 3.9.5-focal
2023-01-18T21:14:12.462593167Z crunch-run Executing container: arvc1-dz642-w0fp51xhnku51sp
2023-01-18T21:14:12.462604121Z crunch-run Executing on host 'slurm-worker-green-1'
2023-01-18T21:14:12.853840766Z crunch-run container token "v2/arvc1-gj3su-fpb2iyhh05b19z1/164wlivcxz9970wzxnmqpkxgjf2350foff36i9ao77bj2ugf7o/arvc1-dz642-w0fp51xhnku51sp"
2023-01-18T21:14:12.856283481Z crunch-run Running [arv-mount --foreground --read-write --storage-classes default --crunchstat-interval=10 --file-cache 268435456 --mount-by-pdh by_id --disable-event-listening --mount-by-id by_uuid /tmp/crunch-run.arvc1-dz642-w0fp51xhnku51sp.827145214/keep2226321640]
2023-01-18T21:15:14.869084599Z crunch-run Fetching Docker image from collection '37bfa630ed9e843f553318a492437c56+219'
2023-01-18T21:15:14.950446087Z crunch-run Using Docker image id "sha256:4abeb6eed624276b385dad208ac4f6a831be161d36f5f2c35906a16f82344658"
2023-01-18T21:15:14.950546808Z crunch-run Loading Docker image from keep
2023-01-18T21:15:16.156836821Z crunch-run Starting container
2023-01-18T21:15:16.171152601Z crunch-run Waiting for container to finish
2023-01-18T21:15:19.323829274Z stderr FATAL: exec /entrypoint failed: fork/exec /entrypoint: argument list too long
2023-01-18T21:15:19.411908844Z crunch-run Container exited with status code 255 (signal -1)
2023-01-18T21:15:19.692933814Z crunch-run Complete

The `argument list too long` error message is misleading: the executable takes just 5 arguments, and that count is independent of the number of files or collections referenced in the input JSON file.

Further investigation revealed that this error is actually caused by an environment variable that is set to a very large value, making it impossible for any subsequent process to start. In this case the environment variable is `SINGULARITY_BIND`, injected by the singularity runtime into each container it spawns, and populated with a list of all the bind mounts attached to the container.
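The failure mode can be reproduced without Singularity. The following is a minimal sketch, assuming a Linux host; the variable name `DEMO_BIG_VAR` and the helper `spawn_with_env_value` are made up for illustration. It starts a trivial child process with a single oversized environment variable and reports the errno if the exec fails.

```python
import errno
import os
import subprocess
import sys


def spawn_with_env_value(size: int) -> int:
    """Start a trivial child process with one env var holding `size` bytes.

    Returns 0 if the child ran, or the errno reported when exec failed.
    """
    env = dict(os.environ)
    env["DEMO_BIG_VAR"] = "x" * size  # hypothetical name, illustration only
    try:
        subprocess.run([sys.executable, "-c", "pass"], env=env, check=True)
    except OSError as exc:
        return exc.errno
    return 0


if __name__ == "__main__":
    page_size = os.sysconf("SC_PAGE_SIZE")
    small = spawn_with_env_value(1024)
    large = spawn_with_env_value(32 * page_size)
    print("1 KiB value:", errno.errorcode.get(small, small))
    print("32-page value:", errno.errorcode.get(large, large))
```

On Linux the second call is expected to fail with `E2BIG`, which is the errno behind the "argument list too long" message, even though the command line itself is tiny.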

In our case, with our ever-growing number of inputs, we are currently dealing with approximately 850 unique collections that each need to be mounted into the Singularity container, which appears to trigger this problem.

For now we could work around this problem without too much effort, but we expect that we will soon hit situations where workarounds will be cumbersome.

One possible solution could be at the crunch-runner side: instead of bind-mounting each collection separately into the singularity container, it should perhaps be possible to do just one mount with the parent folder containing all collections?


Related issues

Related to Arvados - Bug #18765: engine configuration too big > 1048448 with singularity (New)

#1

Updated by Brett Smith over 1 year ago

  • Related to Bug #18765: engine configuration too big > 1048448 with singularity added
#2

Updated by Brett Smith over 1 year ago

I believe that the specific limit the job is hitting here is that the value of SINGULARITY_BIND exceeds the allowed length of a single environment variable, causing execve to return E2BIG and the reported "Argument list too long" error message. Quoting the execve man page:

… the limit per string is 32 pages (the kernel constant MAX_ARG_STRLEN)…

Most Linux systems use a 4KiB page size, so this means environment variable values passed to execve can be at most 128KiB. Since SINGULARITY_BIND is a single string documenting all bind mounts (source), it makes sense that this environment variable would hit this limit before we hit the limit of passing a bunch of --bind options to singularity exec as individual strings.
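As a rough back-of-the-envelope check of that limit, here is a sketch that estimates how many bind entries fit into one environment string. The entry layout (`<mount root>/by_id/<pdh>:/keep/<pdh>:ro`) is an assumption for illustration, built from the mount path and PDH that appear in the log above; the real strings crunch-run passes may differ. Singularity's bind list is comma-separated, and the kernel counts the whole `NAME=value` string plus its terminating NUL.

```python
import os

MAX_ARG_PAGES = 32  # per-string limit quoted from the execve man page


def max_env_string_bytes(page_size: int) -> int:
    """Largest NAME=value string (including the NUL) that execve accepts."""
    return MAX_ARG_PAGES * page_size


def max_bind_entries(limit_bytes: int, entry_bytes: int) -> int:
    """Number of comma-separated entries of `entry_bytes` that fit.

    Subtracts the "SINGULARITY_BIND=" prefix and the trailing NUL, then
    solves n * entry_bytes + (n - 1) <= usable for n.
    """
    usable = limit_bytes - len("SINGULARITY_BIND=") - 1
    return (usable + 1) // (entry_bytes + 1)


if __name__ == "__main__":
    # Assumed entry layout; only the mount root and PDH come from the log.
    mount_root = (
        "/tmp/crunch-run.arvc1-dz642-w0fp51xhnku51sp.827145214/keep2226321640"
    )
    pdh = "37bfa630ed9e843f553318a492437c56+219"
    entry = f"{mount_root}/by_id/{pdh}:/keep/{pdh}:ro"

    limit = max_env_string_bytes(os.sysconf("SC_PAGE_SIZE"))
    print("per-string limit (bytes):", limit)
    print("bytes per assumed entry:", len(entry))
    print("entries that fit:", max_bind_entries(limit, len(entry)))
```

Under these assumptions, and with 4 KiB pages, the estimate lands in the same range as the collection counts reported in the description, which is consistent with the per-string limit being the one that is hit.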

All this means it's related to #18765, but not actually the same bug.

It might be possible to work around this by specifying your own SINGULARITY_BIND environment variable in the workflow step/container request, assuming your code doesn't need it. I have not completely confirmed this, but from skimming the source, it looks like Singularity sets user-defined environment variables after its own, and can overwrite them as you would expect. If I'm right, then Singularity would still build the huge SINGULARITY_BIND value, but then overwrite it with your own, so the too-big value would never be sent to execve where the size limit is checked and enforced.

If that does in fact work, we could maybe look at giving administrators an option to easily do this for all containers?

#3

Updated by Tom Schoonjans about 1 year ago

We can now confirm that your proposed workaround works, by adding our own `SINGULARITY_BIND` variable to the `EnvVarRequirement` block of the affected steps.
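For anyone who wants to try the same thing, here is a small sketch that generates the requirement as JSON, which CWL also accepts. The helper name `env_override_requirement` is made up for illustration, and the empty string is only a placeholder: this thread does not record which replacement value was used, so pick one that suits your setup.

```python
import json


def env_override_requirement(value: str) -> dict:
    """Build a CWL EnvVarRequirement that sets SINGULARITY_BIND to `value`."""
    return {
        "class": "EnvVarRequirement",
        "envDef": {"SINGULARITY_BIND": value},
    }


if __name__ == "__main__":
    # Placeholder value; the thread does not say which value was used.
    fragment = {"requirements": [env_override_requirement("")]}
    print(json.dumps(fragment, indent=2))
```

The printed `requirements` entry can be merged into each affected step or tool definition.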

Many thanks!

#4

Updated by Peter Amstutz about 1 year ago

  • Release set to 60
#5

Updated by Peter Amstutz about 2 months ago

  • Target version set to Future