Bug #12606

closed

Symlink in output points to invalid location -- no such file or directory

Added by Brad Chapman over 6 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version: -
Story points: -

Description

Hi all;
I'm running into an issue on runs with symlinked outputs, specifically for strelka2 variant calling. The run finishes cleanly and then fails when uploading results:

2017-11-17T11:31:37.852090800Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-ploidy.vcf.gz (403 bytes)
2017-11-17T11:31:37.854757100Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-ploidy.vcf.gz.tbi (385 bytes)
2017-11-17T11:31:37.854791000Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-regions-merged.bed (285572 bytes)
2017-11-17T11:31:37.854875100Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-regions-merged.bed.gz (94913 bytes)
2017-11-17T11:31:37.854917900Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-regions-merged.bed.gz.tbi (13964 bytes)
2017-11-17T11:31:37.854941700Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-regions.bed (285572 bytes)
2017-11-17T11:31:37.855083700Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/stats/runStats.tsv (170 bytes)
2017-11-17T11:31:37.856529200Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/stats/runStats.xml (451 bytes)
2017-11-17T11:31:37.856914000Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/variants/genome.S1.vcf.gz (24603795 bytes)
2017-11-17T11:31:37.868911500Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/variants/genome.S1.vcf.gz.tbi (34744 bytes)
2017-11-17T11:31:37.868975600Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/variants/genome.vcf.gz (24603795 bytes)
2017-11-17T11:31:37.878724000Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/variants/genome.vcf.gz.tbi (34744 bytes)
2017-11-17T11:31:37.878796700Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/variants/variants.vcf.gz (4136633 bytes)
2017-11-17T11:31:37.880521500Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/results/variants/variants.vcf.gz.tbi (59036 bytes)
2017-11-17T11:31:37.880587900Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/runWorkflow.py (7795 bytes)
2017-11-17T11:31:37.882003000Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/runWorkflow.py.config.pickle (4785 bytes)
2017-11-17T11:31:37.882031100Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/workflow.error.log.txt (0 bytes)
2017-11-17T11:31:37.882054800Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/workflow.exitcode.txt (2 bytes)
2017-11-17T11:31:37.882076800Z Uploading strelka2/chr19/NA24385-chr19_0_16089283-block-work/workflow.warning.log.txt (0 bytes)
2017-11-17T11:31:37.882133000Z While uploading output files: Symlink in output "/strelka2/chr19/NA24385-chr19_0_16089283-block.vcf.gz" points to invalid location "genome.S1.vcf.gz": lstat /tmp/603101599/strelka2/chr19/genome.S1.vcf.gz: no such file or directory
2017-11-17T11:31:37.882144900Z Cancelled

This is an example project and run that show the problem:

https://cloud.curoverse.com/container_requests/qr1hi-xvhdp-j10b0yribnprohf
https://cloud.curoverse.com/container_requests/qr1hi-xvhdp-ewo8ck7owxbuyud#Log

The upload step appears to move files, then fail later when it reaches a symlink that points to those files. Here is what the directory structure for one of those strelka2 runs looks like:

├── Test1-chrM_0_1000-block-ploidy.vcf.gz
├── Test1-chrM_0_1000-block-ploidy.vcf.gz.tbi
├── Test1-chrM_0_1000-block-regions.bed
├── Test1-chrM_0_1000-block-regions-merged.bed
├── Test1-chrM_0_1000-block-regions-merged.bed.gz
├── Test1-chrM_0_1000-block-regions-merged.bed.gz.tbi
├── Test1-chrM_0_1000-block.vcf.gz -> Test1-chrM_0_1000-block-work/results/variants/genome.vcf.gz
├── Test1-chrM_0_1000-block.vcf.gz.tbi -> Test1-chrM_0_1000-block-work/results/variants/genome.vcf.gz.tbi
└── Test1-chrM_0_1000-block-work
    ├── results
    │   ├── stats
    │   │   ├── runStats.tsv
    │   │   └── runStats.xml
    │   └── variants
    │       ├── genome.S1.vcf.gz
    │       ├── genome.S1.vcf.gz.tbi
    │       ├── genome.vcf.gz -> genome.S1.vcf.gz
    │       ├── genome.vcf.gz.tbi -> genome.S1.vcf.gz.tbi
    │       ├── variants.vcf.gz
    │       └── variants.vcf.gz.tbi
    ├── runWorkflow.py
    ├── runWorkflow.py.config.pickle
    ├── workflow.error.log.txt
    ├── workflow.exitcode.txt
    └── workflow.warning.log.txt
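
For anyone trying to reproduce this outside of Arvados, here is a minimal Python sketch of the chained relative symlinks in the tree above. It rests on an assumption that this report does not confirm: the path in the error message (genome.S1.vcf.gz looked up directly under strelka2/chr19/) is what you would get if the second link in the chain were resolved relative to the directory of the first link instead of its own directory. The helpers build_layout and resolve_chain and the relative_to_first_link flag are hypothetical names for illustration and are not part of crunch-run.

```python
import os
import tempfile


def build_layout(root):
    """Recreate a cut-down version of the strelka2 output tree under root."""
    variants = os.path.join(
        root, "Test1-chrM_0_1000-block-work", "results", "variants")
    os.makedirs(variants)
    with open(os.path.join(variants, "genome.S1.vcf.gz"), "wb") as handle:
        handle.write(b"placeholder")
    # Inner link: genome.vcf.gz -> genome.S1.vcf.gz (same directory).
    os.symlink("genome.S1.vcf.gz", os.path.join(variants, "genome.vcf.gz"))
    # Outer link: block.vcf.gz -> block-work/results/variants/genome.vcf.gz
    top = os.path.join(root, "Test1-chrM_0_1000-block.vcf.gz")
    os.symlink(
        "Test1-chrM_0_1000-block-work/results/variants/genome.vcf.gz", top)
    return top


def resolve_chain(path, relative_to_first_link=False, max_hops=32):
    """Follow a chain of symlinks and return the final path.

    By default each relative target is joined to the directory of the
    link it was read from. With relative_to_first_link=True every target
    is joined to the directory of the first link (the assumed bug).
    """
    base = os.path.dirname(path)
    current = path
    for _ in range(max_hops):
        if not os.path.islink(current):
            return current
        target = os.readlink(current)
        if not relative_to_first_link:
            base = os.path.dirname(current)
        current = os.path.normpath(os.path.join(base, target))
    raise OSError("too many levels of symbolic links: " + path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        top = build_layout(root)
        for flag in (False, True):
            resolved = resolve_chain(top, relative_to_first_link=flag)
            print(flag, os.path.relpath(resolved, root),
                  os.path.exists(resolved))
```

With the default behaviour the chain ends at the real genome.S1.vcf.gz inside the work directory. With relative_to_first_link=True it ends at a genome.S1.vcf.gz next to the top-level link, which does not exist, and that has the same shape as the path in the lstat error.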

Thanks for any tips or clues on how to work around or fix this.


Subtasks 2 (1 open, 1 closed)

Task #12637: Review (Closed, Lucas Di Pentima, 11/17/2017)
Task #12609: Diagnose (New, Peter Amstutz, 11/17/2017)

Related issues

Related to Arvados - Bug #12183: [crunch-run] Handle symlinks with absolute paths into output directory (Resolved, Peter Amstutz, 09/28/2017)
Related to Arvados - Bug #13100: [crunch-run] Replace custom manifest-writing code with collectionFS (Resolved, Tom Clegg, 03/15/2018)