Idea #6309

Updated by Brett Smith over 8 years ago

For jobs that access many files simultaneously, FUSE's default block cache is usually not sufficient and will thrash regularly. This can lead to blocks being downloaded multiple times and to degraded performance.

 If there's evidence that it will substantially benefit production pipelines, add a runtime_constraint to jobs so they can specify an argument for the @--file-cache@ option of their FUSE mount. 

 h2. Implementation 

 * When the job specifies a @keep_cache_mb_per_task@ runtime constraint, crunch-job calls arv-mount with that value in the @--file-cache@ switch (converting units as needed). 
 * Document this new runtime constraint in the Jobs schema API reference. 

 If the performance benefit isn't big enough, we don't want to do this, for all the usual reasons we want to avoid more development on our current JSON pipeline templates and Crunch. 

 Bryan will run some benchmarks and report on how helpful this new runtime constraint can be.
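
The unit conversion in the first implementation step could be sketched as follows. This is only an illustration in Python, not crunch-job's actual code: the helper name @file_cache_args@ is hypothetical, and it assumes that @keep_cache_mb_per_task@ is expressed in MiB and that @--file-cache@ expects a size in bytes.

```python
# Hypothetical helper: build the arv-mount cache argument from a job's
# runtime constraints. Assumes the constraint is in MiB and that
# --file-cache takes a byte count (both are assumptions, not confirmed here).
MIB = 1024 * 1024


def file_cache_args(runtime_constraints):
    """Return extra arv-mount arguments for the requested cache size."""
    cache_mb = runtime_constraints.get("keep_cache_mb_per_task")
    if cache_mb is None:
        # No constraint given: leave arv-mount's default cache size in place.
        return []
    return ["--file-cache={}".format(int(cache_mb) * MIB)]
```

A job that does not set the constraint gets an empty argument list, so the mount keeps its default cache size and existing jobs behave as before.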
