Story #13925

Default keep cache scales with machine size

Added by Peter Amstutz 5 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

The default keep cache size is 256 MiB. For certain workloads, this is much too small. In particular, multithreaded workloads that read from multiple files experience severe cache contention. Unfortunately, it is difficult for users to diagnose performance problems caused by the keep cache. Often the response is simply to increase the machine size. However, because the keep cache does not scale with machine size, this has no effect.

Based on the observations that (a) users request multicore machines for multithreaded workloads and (b) users' typical response to performance problems is to scale up the machine, we should scale the default keep cache based on machine size.

The cache should be either a percentage of RAM (say 12.5%) or a fixed amount per core (say 384 MiB per core).

This could be computed by a-c-r or on the API server.
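As a rough illustration of the two sizing rules proposed above, here is a minimal Python sketch. The function name, the `policy` argument, and the choice to never go below the current 256 MiB default are assumptions made for this example; they are not taken from a-c-r or the API server.

```python
MiB = 1024 * 1024

# Current fixed default, as stated in the description.
DEFAULT_KEEP_CACHE = 256 * MiB


def scaled_keep_cache(vcpus, ram_bytes, policy="per_core",
                      ram_fraction=0.125, per_core=384 * MiB):
    """Return a default keep cache size in bytes for the requested resources.

    Hypothetical helper: `policy` selects one of the two rules from the
    description ("ram_fraction" or "per_core").
    """
    if policy == "ram_fraction":
        # Percentage of requested RAM, e.g. 12.5%.
        size = int(ram_bytes * ram_fraction)
    elif policy == "per_core":
        # Fixed amount per requested core, e.g. 384 MiB per core.
        size = vcpus * per_core
    else:
        raise ValueError("unknown policy: %r" % (policy,))
    # Assumption: small requests keep at least the existing 256 MiB default.
    return max(size, DEFAULT_KEEP_CACHE)
```

For a request of 4 vcpus and 16 GiB of RAM, the per-core rule gives 1536 MiB and the RAM-fraction rule gives 2048 MiB, so the two rules can differ noticeably for the same request.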

History

#1 Updated by Peter Amstutz 5 months ago

  • Status changed from New to In Progress

#2 Updated by Peter Amstutz 5 months ago

  • Description updated (diff)
  • Status changed from In Progress to New

#3 Updated by Joshua Randall 5 months ago

When you say "machine size" do you actually mean the vcpus allocated to the job? On our system we use slurm with consumable resources and cgroup limits, so multiple jobs are run on each machine.

#4 Updated by Peter Amstutz 5 months ago

Joshua Randall wrote:

When you say "machine size" do you actually mean the vcpus allocated to the job? On our system we use slurm with consumable resources and cgroup limits, so multiple jobs are run on each machine.

Good point. Yes, I was thinking it would be calculated on vcpus / RAM allocated to the job, since we don't have an actual machine size at the point of making the container request. But really I wrote this on the assumption of cloud nodes, where we try to allocate the best fit VM for the job and only run one job at a time per VM.

The intention is to make it so that the intuition of giving more resources to a slow tool would have some effect. However, if that means the user ends up asking for more cores / RAM that are not actually going to be used, that is quite wasteful.

Another way to go about this might be to use the warning mechanism under development in #13773 to report cache thrashing, and have a streamlined way of retrying with a bigger cache.
