Feature #19428


keep-web/collectionfs/sitefs performance improvements

Added by Tom Clegg 5 months ago. Updated 4 months ago.



Address performance issues in keep-web, particularly in sequences of S3 requests for a single collection using a single token, which should be an ideal scenario for the session cache.

Subtasks 1 (0 open, 1 closed)

Task #19448: Review 19428-webdav-performance (Resolved, Peter Amstutz, 09/02/2022)

Actions #2

Updated by Tom Clegg 5 months ago

Production log files indicate a bimodal timing distribution: most requests have timeToStatus around 20 ms, but some are around 1-2 seconds even though they don't make any calls out to controller.

I suspect this is caused by pruneSessions's call to fs.MemorySize() racing with the filesystem accesses that are needed to fulfil the request. When pruneSessions wins the race, fs.MemorySize() blocks all other filesystem accesses while it traverses the entire filesystem (all projects, collections, files, and data segments). Since a GET / HEAD request does multiple filesystem operations, it can even be interrupted this way more than once. Ironically this is more likely to happen when there are fewer active sessions.

We might want to make pruneSessions less aggressive (e.g., max 1x per 10 s) but the most important change is to make MemorySize() non-blocking.
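A minimal sketch of both ideas, assuming a hypothetical `sizedFS` type rather than the real collectionfs code: the filesystem keeps a running byte total that is updated on each write, so `MemorySize()` becomes a single atomic read instead of a locked traversal, and a small `throttle` helper limits how often a prune pass may run. All names here are illustrative and are not taken from the branch.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// sizedFS is a hypothetical stand-in for a session filesystem.
// It is not the real collectionfs type.
type sizedFS struct {
	size  int64             // running total in bytes; accessed atomically (kept first for 64-bit alignment)
	mu    sync.Mutex        // guards files
	files map[string][]byte // in-memory file contents
}

func newSizedFS() *sizedFS {
	return &sizedFS{files: map[string][]byte{}}
}

// Write stores data under name and adjusts the running total by the
// difference between the new and old content lengths.
func (fs *sizedFS) Write(name string, data []byte) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	delta := int64(len(data)) - int64(len(fs.files[name]))
	fs.files[name] = data
	atomic.AddInt64(&fs.size, delta)
}

// MemorySize returns the running total without taking fs.mu, so it
// never delays concurrent filesystem operations.
func (fs *sizedFS) MemorySize() int64 {
	return atomic.LoadInt64(&fs.size)
}

// throttle permits an action at most once per interval.
type throttle struct {
	mu       sync.Mutex
	interval time.Duration
	last     time.Time
}

// Allow reports whether the action may run at time now, and records
// now as the last run time if it may.
func (t *throttle) Allow(now time.Time) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if !t.last.IsZero() && now.Sub(t.last) < t.interval {
		return false
	}
	t.last = now
	return true
}

// throttleDemo asks a 10-second throttle for permission at 0 s, 1 s
// and 11 s after an arbitrary start time.
func throttleDemo() [3]bool {
	th := &throttle{interval: 10 * time.Second}
	start := time.Now()
	return [3]bool{
		th.Allow(start),
		th.Allow(start.Add(time.Second)),
		th.Allow(start.Add(11 * time.Second)),
	}
}

func main() {
	fs := newSizedFS()
	fs.Write("a", make([]byte, 100))
	fs.Write("a", make([]byte, 40)) // overwriting shrinks the total
	fs.Write("b", make([]byte, 10))
	fmt.Println(fs.MemorySize()) // 50
	fmt.Println(throttleDemo())  // [true false true]
}
```

The trade-off in this sketch is that every mutation has to keep the counter accurate, in exchange for a size query that costs the same regardless of how many projects, collections, and files the session holds.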

This branch also fixes a few less significant delays.

19428-webdav-performance @ 2c0638b0444652591fa18d6cae2d1977ee5e5731 -- developer-run-tests: #3280
  • make fs.MemorySize() non-blocking
  • eliminate a leading "/" in an API call that was causing a 301 redirect on each user lookup
  • ask for a larger page size when populating a project directory (the per-page overhead for group#contents can be significant)
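The last two items can be illustrated with a short sketch. It assumes two hypothetical helpers, `joinURL` and `requestsNeeded`, and a placeholder host name; neither helper is taken from the branch. The first shows one way to avoid a doubled or leading "/" when building a request URL, and the second shows how the number of list requests drops as the page size grows.

```go
package main

import (
	"fmt"
	"strings"
)

// joinURL joins a base URL and an API path with exactly one slash
// between them, regardless of how either side is written.
func joinURL(base, apiPath string) string {
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(apiPath, "/")
}

// requestsNeeded returns how many list requests it takes to fetch
// total items at pageSize items per request (ceiling division).
func requestsNeeded(total, pageSize int) int {
	if total <= 0 || pageSize <= 0 {
		return 0
	}
	return (total + pageSize - 1) / pageSize
}

func main() {
	// The host name is a placeholder, not a real endpoint.
	fmt.Println(joinURL("https://example.invalid/arvados/v1/", "/users/current"))
	// prints https://example.invalid/arvados/v1/users/current

	// Item count and page sizes are made-up numbers for illustration.
	fmt.Println(requestsNeeded(2500, 100), requestsNeeded(2500, 1000)) // 25 3
}
```

With a fixed per-request overhead, going from 25 round trips to 3 for the same directory listing removes most of that overhead, which is the effect the larger page size is after.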
Actions #3

Updated by Peter Amstutz 5 months ago

  • Category set to Keep
Actions #4

Updated by Tom Clegg 5 months ago

  • Target version changed from 2022-08-31 sprint to 2022-09-14 sprint
Actions #5

Updated by Tom Clegg 5 months ago

A user tested the note-2 patch in the affected production environment and reported a huge improvement.

19428-webdav-performance @ 193ebedab37c71170f649308732fe0a18d7d2ba6 -- developer-run-tests: #3283

wb1 retry developer-run-tests-apps-workbench-integration: #3531

  • also avoids computing each session's MemorySize twice (won't really affect user-facing performance, but will reduce CPU load on the server host by a bit)
Actions #6

Updated by Peter Amstutz 4 months ago

  • Target version changed from 2022-09-14 sprint to 2022-09-28 sprint
Actions #7

Updated by Peter Amstutz 4 months ago

This LGTM, please merge & cherry-pick onto 2.4.3.

Actions #8

Updated by Tom Clegg 4 months ago

  • Status changed from In Progress to Resolved
