Idea #6380

closed

Reading a new collection from keep takes extra time

Added by Bryan Cosca almost 9 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date: 06/22/2015
Due date:
Story points: -

Description

When testing how samtools merge reads its input files, I ran the following tests:

#!/bin/bash

echo starting local
time ./samtools merge 22.bam *22.bam
rm 22.bam
echo starting arv-mount
time ~/keep/by_id/0b5dd5ad3fd555dbb9ef81a027b69dec+18147/samtools merge 22.bam *22.bam
rm 22.bam
echo starting read-keep
time ~/keep/by_id/0b5dd5ad3fd555dbb9ef81a027b69dec+18147/samtools merge 22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xaa.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xab.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xac.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xad.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xae.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xaf.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xag.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xah.22.bam \
~/keep/by_id/ff037b7792b5f287b8553db679714717+185949/xai.22.bam
rm 22.bam
echo starting new-coll
time ~/keep/by_id/0b5dd5ad3fd555dbb9ef81a027b69dec+18147/samtools merge 22.bam \
~/keep/by_id/84585b846972161cd8b106226bc1ba0a+817/*

... and got these results.

starting local

real 0m22.562s
user 0m20.788s
sys 0m0.416s
starting arv-mount

real 0m22.754s
user 0m20.796s
sys 0m0.380s
starting read-keep

real 0m22.560s
user 0m20.580s
sys 0m0.416s
starting new-coll

real 2m35.678s
user 0m25.852s
sys 0m1.392s

Here, 84585b846972161cd8b106226bc1ba0a+817 is a new collection I created using Workbench, and ff037b7792b5f287b8553db679714717+185949 is a collection that had been accessed previously. (All commands operate on the same files.)

I am concerned that when a job runs with a Docker image that has never accessed a collection, the time to load that collection will scale with the number of files in it.
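
One way to check whether the delay is a one-time cost on first access would be to read the same collection twice and compare the timings. The following is only a minimal sketch: `time_read_all` is a hypothetical helper written for this ticket, not part of Arvados, and it assumes the collection is reachable through the same arv-mount path used in the tests above. It uses `date +%s`, so the resolution is whole seconds.

```bash
#!/bin/bash

# Hypothetical helper: read every regular file in a directory once and
# print the elapsed wall-clock time in whole seconds.
time_read_all() {
    local dir=$1 start end f
    start=$(date +%s)
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (e.g. subdirectories).
        if [ -f "$f" ]; then
            cat "$f" > /dev/null
        fi
    done
    end=$(date +%s)
    echo $((end - start))
}
```

For example, calling `time_read_all ~/keep/by_id/84585b846972161cd8b106226bc1ba0a+817` twice in a row and comparing the two numbers would show whether the second read is faster. Repeating this with collections containing different numbers of files would indicate whether the first-read time grows with file count.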

Actions #1

Updated by Peter Amstutz over 4 years ago

  • Status changed from New to Closed