Feature #19428

closed

keep-web/collectionfs/sitefs performance improvements

Added by Tom Clegg about 1 month ago. Updated 13 days ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Keep
Target version:
Start date: 09/02/2022
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Description

Address performance issues in keep-web, particularly in sequences of S3 requests for a single collection using a single token, which should be an ideal scenario for the session cache.


Subtasks 1 (0 open, 1 closed)

Task #19448: Review 19428-webdav-performance (Resolved, Peter Amstutz, 09/02/2022)

Actions #2

Updated by Tom Clegg about 1 month ago

Production log files indicate a bimodal timing distribution: most requests have timeToStatus around 20 ms, but some are around 1-2 seconds even though they don't make any calls out to controller.

I suspect this is caused by pruneSessions's call to fs.MemorySize() racing with the filesystem accesses that are needed to fulfil the request. When pruneSessions wins the race, fs.MemorySize() blocks all other filesystem accesses while it traverses the entire filesystem (all projects, collections, files, and data segments). Since a GET / HEAD request does multiple filesystem operations, it can even be interrupted this way more than once. Ironically this is more likely to happen when there are fewer active sessions.

We might want to make pruneSessions less aggressive (e.g., max 1x per 10 s) but the most important change is to make MemorySize() non-blocking.

This branch also fixes a few less significant delays.

19428-webdav-performance @ 2c0638b0444652591fa18d6cae2d1977ee5e5731 -- developer-run-tests: #3280
  • make fs.MemorySize() non-blocking
  • eliminate a leading "/" in an API call that was causing a 301 redirect on each user lookup
  • ask for a larger page size when populating a project directory (the per-page overhead for group#contents can be significant)
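The note above does not say how the leading "/" produced the redirect. One plausible mechanism, assumed here, is that appending a path with a leading "/" to a base URL that already ends in "/" yields a doubled slash, which some servers answer with a 301 to the cleaned path. A sketch of a defensive join follows; `joinAPIPath` and the base URL are hypothetical, and the path is only an example.

```go
package main

import (
	"fmt"
	"strings"
)

// joinAPIPath is a hypothetical helper (not part of the Arvados SDK).
// It joins a base URL and an API path with exactly one "/" between
// them, whether or not either side already has one.
func joinAPIPath(baseURL, apiPath string) string {
	return strings.TrimRight(baseURL, "/") + "/" + strings.TrimLeft(apiPath, "/")
}

func main() {
	base := "https://controller.example/" // placeholder host

	// Naive concatenation produces a doubled slash after the host.
	fmt.Println(base + "/arvados/v1/users/current")

	// The helper produces a single slash for either form of the path.
	fmt.Println(joinAPIPath(base, "/arvados/v1/users/current"))
	fmt.Println(joinAPIPath(base, "arvados/v1/users/current"))
}
```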
Actions #3

Updated by Peter Amstutz about 1 month ago

  • Category set to Keep
Actions #4

Updated by Tom Clegg about 1 month ago

  • Target version changed from 2022-08-31 sprint to 2022-09-14 sprint
Actions #5

Updated by Tom Clegg about 1 month ago

A user tested the note-2 patch in the affected production environment and reported a huge improvement.

19428-webdav-performance @ 193ebedab37c71170f649308732fe0a18d7d2ba6 -- developer-run-tests: #3283

wb1 retry developer-run-tests-apps-workbench-integration: #3531

  • also avoids computing each session's MemorySize twice (won't really affect user-facing performance, but will reduce CPU load on the server host by a bit)
Actions #6

Updated by Peter Amstutz 18 days ago

  • Target version changed from 2022-09-14 sprint to 2022-09-28 sprint
Actions #7

Updated by Peter Amstutz 13 days ago

This LGTM, please merge & cherry-pick onto 2.4.3.

Actions #8

Updated by Tom Clegg 13 days ago

  • Status changed from In Progress to Resolved