Feature #20995

closed

Prefetch small files when scanning a collection directory

Added by Peter Amstutz 7 months ago. Updated 2 months ago.

Status: Duplicate
Priority: Normal
Assigned To:
Category: Keep
Story points: -

Description

When a client reads multiple small files in a collection through keep-web, keep-web should anticipate that the client may also read other files in the same collection directory that are adjacent in the data stream to the file being accessed, and prefetch those files in parallel as well.

My thinking is that if a given stream is smaller than X megabytes (256? 512? 1024? configurable?), keep-web would start a parallel prefetch of the entire stream. In particular, if there are lots of small blocks, we want to have many prefetch requests in flight at once.
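
A minimal sketch of the size check and bounded parallel prefetch described above. Everything named here is hypothetical and not taken from keep-web: `shouldPrefetchStream`, `prefetchAll`, the 256 MiB threshold (the ticket leaves the value open), the in-flight limit of 2, and the placeholder block names. The `fetch` callback stands in for whatever would actually pull a block into cache.

```go
package main

import (
	"fmt"
	"sync"
)

// shouldPrefetchStream reports whether a stream is small enough to be
// prefetched in full. The threshold is an assumption; the ticket only
// suggests candidate values (256, 512, 1024 MB, or configurable).
func shouldPrefetchStream(streamBytes, thresholdBytes int64) bool {
	return streamBytes > 0 && streamBytes < thresholdBytes
}

// prefetchAll calls fetch once per block locator, keeping at most
// maxInFlight calls running concurrently, and returns when all are done.
func prefetchAll(locators []string, maxInFlight int, fetch func(string)) {
	if maxInFlight < 1 {
		maxInFlight = 1
	}
	sem := make(chan struct{}, maxInFlight) // counting semaphore
	var wg sync.WaitGroup
	for _, loc := range locators {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot before starting another fetch
		go func(loc string) {
			defer wg.Done()
			defer func() { <-sem }()
			fetch(loc)
		}(loc)
	}
	wg.Wait()
}

func main() {
	const threshold = 256 << 20 // hypothetical: 256 MiB
	blocks := []string{"block-a", "block-b", "block-c", "block-d"}

	var mu sync.Mutex
	fetched := 0
	if shouldPrefetchStream(64<<20, threshold) {
		prefetchAll(blocks, 2, func(string) {
			mu.Lock()
			fetched++
			mu.Unlock()
		})
	}
	fmt.Println("blocks fetched:", fetched)
}
```

With many small blocks, the in-flight limit is the knob that matters: a larger value keeps more requests outstanding at the cost of more concurrent load on the backing store.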

One thought is to use logic similar to large file block prefetch, where we start from the first read and just read ahead in the stream, except that we ignore file boundaries. If prefetch would take us past the end of the stream, we wrap around and start reading at the beginning.
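
The wrap-around read-ahead can be sketched as a function that yields block indexes in request order. This is only an illustration of the ordering: `prefetchOrder` is a hypothetical helper, and it assumes the stream is addressed as a simple list of block indexes, which is a simplification.

```go
package main

import "fmt"

// prefetchOrder returns the block indexes of a stream in the order a
// wrap-around read-ahead would request them: begin at the block that
// served the first read, continue to the end of the stream, then wrap
// to block 0 and continue up to the starting block. File boundaries
// are ignored; only the position in the stream matters.
func prefetchOrder(numBlocks, firstRead int) []int {
	if numBlocks <= 0 {
		return nil
	}
	// Normalize an out-of-range or negative start into [0, numBlocks).
	start := ((firstRead % numBlocks) + numBlocks) % numBlocks
	order := make([]int, 0, numBlocks)
	for i := 0; i < numBlocks; i++ {
		order = append(order, (start+i)%numBlocks)
	}
	return order
}

func main() {
	// First read lands in block 3 of a 5-block stream.
	fmt.Println(prefetchOrder(5, 3)) // prints [3 4 0 1 2]
}
```

Starting at the first read rather than at block 0 means the blocks most likely to be needed next are requested first, while the wrap-around still covers files stored earlier in the stream.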

The controlling assumptions here are (a) we have a lot of fast cache where we can dump our blocks, and (b) the stream ordering is at least vaguely similar to the order of data access.


Related issues

Related to Arvados Epics - Idea #18342: Keep performance optimization (New, 08/01/2023, 05/30/2024)
Related to Arvados - Feature #18961: Go FileSystem / FUSE mount supports block prefetch (Closed, Tom Clegg)
Actions #1

Updated by Peter Amstutz 7 months ago

  • Related to Idea #18342: Keep performance optimization added
Actions #2

Updated by Peter Amstutz 7 months ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz 7 months ago

  • Target version changed from Future to Development 2023-11-08 sprint
Actions #4

Updated by Peter Amstutz 7 months ago

  • Target version changed from Development 2023-11-08 sprint to Development 2023-10-25 sprint
Actions #5

Updated by Peter Amstutz 7 months ago

  • Subject changed from Prefetch small file blocks when scanning a collection to Prefetch small files when scanning a collection directory
Actions #6

Updated by Peter Amstutz 7 months ago

  • Target version changed from Development 2023-10-25 sprint to Development 2023-11-08 sprint
Actions #7

Updated by Peter Amstutz 6 months ago

  • Target version changed from Development 2023-11-08 sprint to Development 2023-11-29 sprint
Actions #8

Updated by Peter Amstutz 6 months ago

  • Target version changed from Development 2023-11-29 sprint to Development 2024-01-03 sprint
Actions #9

Updated by Peter Amstutz 5 months ago

  • Target version changed from Development 2024-01-03 sprint to Development 2024-01-17 sprint
Actions #10

Updated by Peter Amstutz 5 months ago

  • Target version changed from Development 2024-01-17 sprint to Development 2024-01-31 sprint
Actions #11

Updated by Peter Amstutz 5 months ago

  • Target version changed from Development 2024-01-31 sprint to Development 2024-02-14 sprint
Actions #12

Updated by Peter Amstutz 3 months ago

  • Target version changed from Development 2024-02-14 sprint to Development 2024-02-28 sprint
Actions #13

Updated by Peter Amstutz 3 months ago

  • Target version changed from Development 2024-02-28 sprint to Development 2024-03-13 sprint
Actions #14

Updated by Peter Amstutz 3 months ago

  • Related to Feature #18961: Go FileSystem / FUSE mount supports block prefetch added
Actions #15

Updated by Peter Amstutz 3 months ago

  • Description updated (diff)
Actions #16

Updated by Tom Clegg 2 months ago

  • Assigned To set to Tom Clegg
  • Status changed from New to Duplicate

Absorbed into #18961.

Actions #17

Updated by Peter Amstutz 2 months ago

  • Target version changed from Development 2024-03-13 sprint to Development 2024-02-28 sprint