Feature #18992

closed

Enable local keepstore on slurm/lsf if cluster config file already exists on compute node

Added by Tom Clegg over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -
Release relationship: Auto

Description

Currently, crunch-run brings up a local keepstore process only if it receives the cluster config on stdin, which happens only under arvados-dispatch-cloud. Under slurm or lsf, local keepstore is not enabled, and no error or warning explains why.

Proposed improvements:
  • If cluster config is not supplied on stdin, try reading it from /etc/arvados/config.yml (or a different value specified on the command line via CrunchRunArgumentsList config)
  • If local keepstore is enabled in config (LocalKeepBlobBuffersPerVCPU>0) but can't be brought up because cluster config file does not exist or cannot be read, log a message to that effect, and proceed using the usual keepstores
  • Always log where the config came from: stdin, a file on the file system, or not found (in which case crunch-run won't try to use it)
  • Explain in config.default.yml comments (and in upgrade notes) that the sysadmin is responsible for deploying the cluster config file to the compute nodes in order to use this feature with slurm or lsf
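
The lookup order proposed above could be sketched roughly as follows. This is an illustration only, not crunch-run's actual code: loadClusterConfig, describeSource, and the defaultConfigPath constant are hypothetical names, and the sketch assumes the caller has already read whatever arrived on stdin into a byte slice.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// defaultConfigPath is the fallback location named in the issue.
const defaultConfigPath = "/etc/arvados/config.yml"

// loadClusterConfig is a hypothetical helper, not crunch-run's real code.
// stdinData holds whatever the caller already read from stdin (possibly
// nothing). It returns the raw config and a label for where it came from;
// an empty label means no config was found.
func loadClusterConfig(stdinData []byte, path string) (config []byte, source string) {
	if len(stdinData) > 0 {
		return stdinData, "stdin"
	}
	buf, err := os.ReadFile(path)
	if err != nil {
		// Missing or unreadable file: report it and let the caller fall
		// back to the usual keepstores instead of failing.
		log.Printf("cannot read cluster config file %s: %v", path, err)
		return nil, ""
	}
	return buf, path
}

// describeSource builds the message that is always logged.
func describeSource(source string) string {
	if source == "" {
		return "cluster config not found; local keepstore will not be started"
	}
	return fmt.Sprintf("loaded cluster config from %s", source)
}

func main() {
	// Simulate the slurm/lsf case: nothing arrives on stdin.
	_, source := loadClusterConfig(nil, defaultConfigPath)
	log.Print(describeSource(source))
}
```

In this sketch a missing or unreadable file is not fatal: the caller gets an empty source label and is expected to carry on with the usual keepstores, as the second bullet asks.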

Files

crunch-run (22.9 MB) crunch-run arvados-server @ fb181ba27fd354e596d2216786ccee9a537bd0a3 Tom Clegg, 04/14/2022 06:18 PM
crunch-run (10.7 MB) crunch-run f6e8752348958e3bb48c7509a4ff78689f2d64c9 (size reduced with upx) Tom Clegg, 04/15/2022 07:35 PM

Subtasks 1 (0 open, 1 closed)

Task #19003: Review 18992-hpc-local-keepstore, Resolved, Peter Amstutz, 04/14/2022

Related issues 1 (0 open, 1 closed)

Related to Arvados - Feature #16347: crunch-run runs local keepstore, Resolved, Tom Clegg, 10/08/2021