Feature #20995

Updated by Peter Amstutz 3 months ago

When a client reads multiple small files in a collection through keep-web, keep-web should anticipate that the client may go on to read other files in the collection directory that are adjacent, in the data stream, to the file being accessed, and should prefetch those files in parallel as well.

My thinking is that if a given stream is less than X megabytes (256? 512? 1024? configurable?) then keep-web would start a parallel prefetch of the entire stream. In particular, if there are lots of small blocks, we want to have a bunch of prefetch requests in flight.

One thought is to use similar logic to large file block prefetch, where we start from the first read and just read ahead in the stream, except we ignore file boundaries. If prefetch would take us past the end of the stream, we wrap around and start reading at the beginning.
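The wrap-around read-ahead could look something like the Go sketch below. `readAheadOrder` and its parameters are hypothetical, and it treats the stream as a plain list of block indices, which is a simplification. Given the block that was just read, it returns the next blocks to prefetch in stream order, ignoring file boundaries and wrapping to the start of the stream when it runs off the end.

```go
package main

// readAheadOrder returns the indices of the blocks to prefetch after a read
// that touched block `current`, in a stream of numBlocks blocks. It reads
// ahead up to `window` blocks, ignores file boundaries, and wraps around to
// block 0 after the last block. The current block is never included.
func readAheadOrder(numBlocks, current, window int) []int {
	if numBlocks <= 1 || window <= 0 || current < 0 || current >= numBlocks {
		return nil
	}
	if window > numBlocks-1 {
		window = numBlocks - 1 // never schedule the same block twice
	}
	order := make([]int, 0, window)
	for i := 1; i <= window; i++ {
		order = append(order, (current+i)%numBlocks)
	}
	return order
}
```

For example, with a 5-block stream, a first read in block 3, and a window of 3, the prefetch order is blocks 4, 0, 1.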

The controlling assumptions here are (a) we have a lot of fast cache where we can dump our blocks, and (b) the stream ordering is at least vaguely similar to the order of data access.
