Story #3640

[SDKs] Add runtime option to SDKs (esp Python and arv-mount) to use a filesystem directory block cache as an alternative to RAM cache.

Added by Tom Clegg almost 5 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: Keep
Target version: -
Start date: -
Due date: -
% Done: 0%
Estimated time: -
Story points: 2.0

Description

Background:

arv-mount has a block cache, which improves performance when the same blocks are read multiple times. However:
  • Currently a new arv-mount process is started for each Crunch task execution. This means tasks don't share a cache, even if they're running at the same time.
  • In the common case where multiple crunch tasks run at the same time and use the same data, we have multiple arv-mount processes each retrieving and caching its own copy of the same data blocks.
Proposed improvement:
  • Use large swap on worker nodes (preferably SSD). (We already do this for other reasons.)
  • Set up a large tmpfs on worker nodes and use it as crunch job scratch space. (This already gets cleared at the beginning of a job to avoid leakage between jobs/users.)
  • Use a directory in that tmpfs as an arv-mount cache. This makes it feasible to use a large cache size, and makes it easy to share the cache between multiple arv-mount processes. (A sketch for checking that a candidate directory is on a suitable filesystem follows this list.)
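Since the scheme depends on the cache directory living on something fast (tmpfs or SSD), a caller may want to verify the filesystem type before enabling the disk cache. The following is a minimal sketch, not existing arv-mount code, and assumes a Linux host where /proc/mounts is available:

    import os

    def filesystem_type(path):
        # Return the filesystem type of the mount containing path,
        # by scanning /proc/mounts (Linux-specific).
        path = os.path.realpath(path)
        best_mountpoint, best_fstype = "", ""
        with open("/proc/mounts") as mounts:
            for line in mounts:
                _device, mountpoint, fstype = line.split()[:3]
                # The longest matching mount point wins, so nested
                # mounts (e.g. a tmpfs on /tmp) resolve correctly.
                if ((path == mountpoint or
                     path.startswith(mountpoint.rstrip("/") + "/")) and
                        len(mountpoint) > len(best_mountpoint)):
                    best_mountpoint, best_fstype = mountpoint, fstype
        return best_fstype

    # Example: warn unless the cache dir is on tmpfs.
    # if filesystem_type("/tmp/keep-cache") != "tmpfs": ...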
Implementation notes:
  • Rely on unix permissions for cache privacy. (Warn if the cache dir's mode & 0007 != 0, but proceed anyway: there are cases where a world-accessible cache is useful and not dangerous.)
  • Use flock() to avoid races and duplicated effort. (If arv-mount 1 is writing a block to the cache, arv-mount 2 should wait for arv-mount 1 to finish and then read from the cache, rather than fetching its own copy.)
  • Do not clean up the cache dir at start/exit, at least by default (the general idea is to share the cache with past/future arv-mount processes). An optional --cache-clear-atexit flag would be nice to have.
  • Measuring/limiting total cache size could be worthwhile.
  • Delete & replace upon finding a corrupt/truncated cache entry. (A sketch combining these points follows this list.)
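A minimal sketch of the entry handling described above, combining the permission warning, flock()-based writer serialization, and delete-and-replace on corrupt entries. It assumes one file per block, named by the block's MD5 locator hash; fetch_from_keep is a hypothetical stand-in for the real Keep client call, and none of this is the actual implementation:

    import fcntl
    import hashlib
    import os
    import sys

    def warn_if_world_accessible(cache_dir):
        # Rely on unix permissions for privacy; warn (but proceed) if
        # the directory is readable/writable/searchable by "other".
        mode = os.stat(cache_dir).st_mode
        if mode & 0o007 != 0:
            sys.stderr.write("warning: cache dir %s is world-accessible "
                             "(mode %o)\n" % (cache_dir, mode & 0o777))

    def get_block(cache_dir, locator_md5, fetch_from_keep):
        path = os.path.join(cache_dir, locator_md5)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        # flock() serializes access to this entry: if another arv-mount
        # process is filling it, we wait for that process to finish and
        # then read its copy instead of fetching a duplicate over the
        # network.
        fcntl.flock(fd, fcntl.LOCK_EX)
        with os.fdopen(fd, "rb+") as f:  # closing f releases the lock
            data = f.read()
            if hashlib.md5(data).hexdigest() == locator_md5:
                return data  # verified cache hit
            # Entry is empty (just created), truncated, or corrupt:
            # replace it with a fresh copy from Keep.
            data = fetch_from_keep(locator_md5)
            f.seek(0)
            f.truncate()
            f.write(data)
            return data

A refinement would be to take LOCK_SH for reads and upgrade to LOCK_EX only when the entry needs to be (re)written, so that concurrent readers of a valid entry do not serialize.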
Integration:
  • The default Keep mount on shell nodes should use a filesystem cache, assuming there is an appropriate filesystem for it (i.e., something faster than the network: tmpfs, SSD, or at least a disk mounted with async/barriers=0).
  • crunch-job should create a per-job temp dir on each node during the "install" phase and point all arv-mount processes to it (see the sketch below).
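For illustration only, the crunch-job step might look like this if written in Python; the --file-cache-dir option is hypothetical, standing in for whatever flag name the feature ends up with:

    import os
    import subprocess

    def start_arv_mount(job_uuid, mount_point, scratch="/tmp"):
        # Per-job cache dir, created on each node during the "install"
        # phase and shared by every arv-mount process in the job.
        cache_dir = os.path.join(scratch, "crunch-job-%s" % job_uuid,
                                 "keep-cache")
        os.makedirs(cache_dir, mode=0o700, exist_ok=True)
        return subprocess.Popen([
            "arv-mount",
            "--file-cache-dir", cache_dir,  # hypothetical flag name
            mount_point,
        ])

Because the cache files are named by content hash and guarded by flock(), every arv-mount process in the job can safely point at the same directory.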

Related issues

Related to Arvados - Feature #6310: [FUSE] Support scaling the internal block cache based on number of open files (New)

Related to Arvados - Story #6311: [Maybe] [SDKs] Support caching Keep blocks in memcached (Rejected)

Related to Arvados - Feature #8228: [SDKs] [FUSE] Python SDK and arv-mount use Range requests when a caller requests part of a block that has been ejected from the cache (New, 01/19/2016)

Has duplicate Arvados - Story #10510: Allow Keep client to cache blocks to disk (Duplicate, 11/10/2016)

History

#1 Updated by Tom Clegg almost 5 years ago

  • Description updated (diff)
  • Category set to Keep

#2 Updated by Tom Clegg almost 5 years ago

  • Target version set to Arvados Future Sprints

#3 Updated by Tom Clegg almost 5 years ago

  • Subject changed from [FUSE] Add runtime option to use a filesystem directory block cache as an alternative to RAM cache. to [FUSE] Add runtime option to arv-mount to use a filesystem directory block cache as an alternative to RAM cache.

#4 Updated by Tom Clegg almost 5 years ago

  • Subject changed from [FUSE] Add runtime option to arv-mount to use a filesystem directory block cache as an alternative to RAM cache. to [FUSE] Add runtime option to SDKs (esp Python and arv-mount) to use a filesystem directory block cache as an alternative to RAM cache.

#5 Updated by Tom Clegg almost 5 years ago

  • Subject changed from [FUSE] Add runtime option to SDKs (esp Python and arv-mount) to use a filesystem directory block cache as an alternative to RAM cache. to [SDKs] Add runtime option to SDKs (esp Python and arv-mount) to use a filesystem directory block cache as an alternative to RAM cache.

#6 Updated by Tom Clegg about 4 years ago

  • Description updated (diff)
