Feature #8409

Updated by Tom Clegg about 8 years ago

arv-mount logs the number of bytes read from it via the FUSE filesystem (blkio:0:0 read), as well as the number of bytes it received from keepstore in order to service those reads (net:keep0 rx). Comparing the two is just my (TC) best guess so far about how to estimate cache effectiveness -- other suggestions are welcome!

 Ideally, any time arv-mount retrieves a block from Keep, it's 100% useful: a user process reads every byte of the data, and that same block never gets re-fetched after expiring from the cache. In crunchstat terms, this would yield a net:keep0÷blkio:0:0 ratio of 1. 

 In practice, some workflows don't get anywhere near ideal. For example, if arv-mount uses a cache big enough to hold 4 blocks (which is the default), an 8-way merge sort can easily generate a sequence of 4 KiB read operations that each cause a 64 MiB block to be ejected from the cache and a different 64 MiB block (itself recently ejected) to be re-fetched. In crunchstat terms, this would yield a net:keep0÷blkio:0:0 ratio of 16384 (64 MiB received for every 4 KiB read).

 crunchstat-summary should compute this ratio -- perhaps just once per task, using the last sample -- and emit a warning/suggestion to increase keep_cache_mb_per_task if the ratio is too high. 

 The definition of "too high" is TBD. Suggest we _always_ print the ratio, and start learning what the "normal" range looks like; we can tweak the warning threshold as we go. 
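 A minimal sketch of the proposed calculation, in Python, might look like the following. It is not the crunchstat-summary implementation: the function names, the warn_above parameter, and the sample log lines are all hypothetical, and the assumed line layout ("crunchstat: <category> <value> <name> ... -- interval ...") should be checked against real crunchstat output. It uses only the last sample per category, always reports the ratio, and adds a suggestion only when a caller-supplied threshold is exceeded.

```python
def parse_sample(line):
    """Return (category, {stat_name: cumulative_value}) or None.

    Assumes (unverified) that a sample looks like:
      "... crunchstat: net:keep0 0 tx 67108864 rx -- interval ..."
    Only the cumulative part before " -- " is used.
    """
    marker = "crunchstat: "
    pos = line.find(marker)
    if pos < 0:
        return None
    words = line[pos + len(marker):].split(" -- ")[0].split()
    if len(words) < 3:
        return None
    category, pairs = words[0], words[1:]
    stats = {}
    # Values and names alternate: "<value> <name> <value> <name> ..."
    for value, name in zip(pairs[0::2], pairs[1::2]):
        try:
            stats[name] = int(value)
        except ValueError:
            continue
    return category, stats


def keep_read_ratio(log_lines):
    """Ratio of net:keep0 rx to blkio:0:0 read, from the last sample of each."""
    last = {}
    for line in log_lines:
        parsed = parse_sample(line)
        if parsed:
            category, stats = parsed
            last[category] = stats  # later samples overwrite earlier ones
    rx = last.get("net:keep0", {}).get("rx")
    read = last.get("blkio:0:0", {}).get("read")
    if rx is None or not read:
        return None  # missing data, or nothing read through FUSE
    return rx / read


def describe_ratio(ratio, warn_above=None):
    """Always report the ratio; warn only if a threshold is given and exceeded."""
    if ratio is None:
        return "net:keep0/blkio:0:0 ratio: n/a"
    text = f"net:keep0/blkio:0:0 ratio: {ratio:.2f}"
    if warn_above is not None and ratio > warn_above:
        text += " (consider increasing keep_cache_mb_per_task)"
    return text


# Hypothetical log lines for the worst case described above:
# one 64 MiB block fetched to serve a single 4 KiB read.
example_lines = [
    "2016-02-10_12:00:00 crunchstat: net:keep0 0 tx 67108864 rx -- interval 10.0000 seconds 0 tx 67108864 rx",
    "2016-02-10_12:00:00 crunchstat: blkio:0:0 0 write 4096 read -- interval 10.0000 seconds 0 write 4096 read",
]
print(describe_ratio(keep_read_ratio(example_lines)))
```

 Leaving warn_above unset matches the suggestion to print the ratio unconditionally until a sensible threshold is known.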