[FUSE] performance issues with stat() in Python
|Target version:|Arvados Future Sprints|
|Velocity based estimate:|-|
We are seeing some performance issues accessing Arvados FUSE mounts from Python.
From one of our users:
I noticed that - when mounting arvados keep - running os.stat is very slow.
I am running a workflow that has 18 BAM files as input, each about 60 GB,
and a reference sequence of 3 GB.
The time it takes to build a workflow is 6 minutes; almost all of this time is spent in calls to os.stat. To give an example:
- 2016-08-02 13:46:18,118 DEBUG DB: collected 0.368199110031 /local/scratch/benchmark/ctmpBa9g4h/shared/Open_Access/xxxxxxxxxxx.fa
- 2016-08-02 13:46:24,087 DEBUG DB: collected 5.96899294853 /local/scratch/benchmark/ctmpBa9g4h/shared/Open_Access/xxxxxxxxxxxxxxxxxxx.bam
- 2016-08-02 13:46:25,408 DEBUG DB: collected 1.32038497925 /local/scratch/benchmark/ctmpBa9g4h/shared/Open_Access/xxxxxxxxxxxxxxxxxxxx.fa
- 2016-08-02 13:46:37,359 DEBUG DB: collected 11.9504730701 /local/scratch/benchmark/ctmpBa9g4h/shared/Open_Access/xxxxxxxxxxxxxxxxxxxx.bam
- 2016-08-02 13:46:38,718 DEBUG DB: collected 1.35802483559 /local/scratch/benchmark/ctmpBa9g4h/shared/Open_Access/xxxxxxxxxxxxxxxxxxxx.fa
- 2016-08-02 13:46:50,784 DEBUG DB: collected 12.065251112 /local/scratch/benchmark/ctmpBa9g4h/shared/Open_Access/xxxxxxxxxxxxxxxxx.bam
Each log message contains a number and a filename. The number is the
time in seconds taken by the os.stat call on that filename. From these numbers I deduce:
1. The time for os.stat is proportional to the file size (compare the BAM versus the fasta files)
2. The time for os.stat does not decrease on repeated calls (see the repeated calls to the fasta files)
This is on london. I am already caching all the os.stat calls within Python.
It seems to me that stat() from the command line is not similarly affected.
The issue might very well be in the python posix module.
Do you - or the people at Arvados - have any advice? I expect this could be a generic issue for anyone trying to build workflows from data in Arvados Keep while relying on file timestamps.
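The per-call timing behind the DEBUG lines above can be reproduced with a small sketch. The path and log format here are illustrative only, not the user's actual code:

```python
import os
import time


def timed_stat(path):
    """Call os.stat on path and report how long the call took,
    in the same spirit as the user's DEBUG log lines."""
    start = time.time()
    st = os.stat(path)
    elapsed = time.time() - start
    print("collected %s %s" % (elapsed, path))
    return st, elapsed


# Hypothetical usage on a FUSE-mounted Keep collection:
# timed_stat("/local/scratch/benchmark/shared/Open_Access/sample.bam")
```

Running this against files of different sizes on the mount would confirm (or refute) the claim that stat time scales with file size.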
As the user pointed out, it could very well be an issue with Python itself, though I thought it was worth checking with you.