[Maybe] [SDKs] Support caching Keep blocks in memcached
We could potentially improve job performance by running memcached on each compute node to store Keep blocks. When a node is running many tasks from a job that access the same data, this cache could make it possible for the block to be downloaded to the node once, then shared across tasks.
If we decide to go ahead with this caching strategy, add the necessary support to the Python SDK KeepClient so that it uses a memcached store when one is available.
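As a rough illustration of the read-through pattern described above, here is a minimal sketch. It is not the SDK's implementation: `CachedBlockReader`, `DictCache`, and the `fetch_block` callable are hypothetical names. The cache object only needs `get(key)` and `set(key, value)` methods, which is the shape most memcached clients expose; an in-memory stand-in is used here so the example runs without a memcached server. It assumes a locator begins with the block's MD5 hex digest followed by `+`-separated fields.

```python
import hashlib


class DictCache:
    """In-memory stand-in offering the get/set subset of a memcached client."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value


class CachedBlockReader:
    """Hypothetical read-through wrapper around a block fetch function."""

    def __init__(self, fetch_block, cache):
        self._fetch_block = fetch_block  # callable: locator -> bytes
        self._cache = cache              # any object with get() and set()

    def get(self, locator):
        # Assumption: the text before the first "+" is the block's MD5 digest.
        # Keying on the digest alone lets tasks holding different hints or
        # signatures for the same block share one cached copy.
        md5sum = locator.split("+", 1)[0]
        data = self._cache.get(md5sum)
        # Verify cached bytes against the digest so a corrupt or foreign
        # entry is treated as a miss rather than returned.
        if data is not None and hashlib.md5(data).hexdigest() == md5sum:
            return data
        data = self._fetch_block(locator)
        self._cache.set(md5sum, data)
        return data
```

Note that keying on the digest alone ignores any permission information carried in the rest of the locator, so anyone who can reach the cache can read the block. A real deployment would need some isolation between users or jobs.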
#2 Updated by Tom Clegg over 3 years ago
- Orchestrating turning up/down memcached when jobs start and stop.
- Firewalling user/job A's memcached from user/job B's memcached. For example, crunch2 allows more than one job per node, and shell VMs are shared.
Memcached is good for sharing free memory (freely!) across nodes. Given that each job has distinct permissions, we'd essentially need a VPN per job in order to take advantage of that feature. And without that feature, I'm not sure memcached would perform any better than a tmpfs-backed filesystem cache.
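For comparison, a filesystem cache of the kind mentioned above could look roughly like the following sketch. `FileBlockCache` is a hypothetical name, and the cache root is assumed to be a directory on a tmpfs mount chosen per job or per user; nothing here is taken from the SDK. Isolation between jobs would come from ordinary directory permissions rather than network firewalling.

```python
import os
import tempfile


class FileBlockCache:
    """Hypothetical block cache storing one file per block under a root dir."""

    def __init__(self, root):
        self._root = root
        # Mode 0700 (further restricted by the umask) keeps other users out.
        os.makedirs(root, mode=0o700, exist_ok=True)

    def _path(self, md5sum):
        return os.path.join(self._root, md5sum)

    def get(self, md5sum):
        """Return the cached bytes, or None if the block is not cached."""
        try:
            with open(self._path(md5sum), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def set(self, md5sum, data):
        """Store a block atomically so readers never see a partial file."""
        fd, tmp_path = tempfile.mkstemp(dir=self._root)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, self._path(md5sum))
```

Writing to a temporary file and renaming it into place means concurrent tasks on the same node either see the whole block or a miss, never a truncated file.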