Feature #17749

Updated by Ward Vandewege 5 months ago

AWS has a hard request limit of 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second *per prefix* in an Amazon S3 bucket, cf. https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/.

Prefixes are defined as follows (cf. https://aws.amazon.com/premiumsupport/knowledge-center/s3-prefix-nested-folders-difference/):

A prefix is the complete path in front of the object name, which includes the bucket name. For example, if an object (123.txt) is stored as BucketName/Project/WordFiles/123.txt, the prefix is “BucketName/Project/WordFiles/”. If the 123.txt file is saved in a bucket without a specified path, the prefix value is "BucketName/".

Keep currently does not store its blocks in subdirectories in the S3 buckets it uses. That means the prefix for every block in a given bucket is "BucketName/", so all requests to that bucket count against a single prefix's request limit.

At some point, we may run into these request limits, particularly when one S3 bucket is shared among many keepstores, e.g. after #16516 is implemented.

The fix would be to use more prefixes in each S3 bucket, perhaps adopting the same pattern keepstore uses when backed by POSIX filesystems.
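
A minimal sketch of what such a key layout could look like, assuming the POSIX-style pattern means "a directory named after the first three hex characters of the block hash". The function name blockKey, the constant prefixLen, and the choice of three characters are illustrative assumptions, not existing keepstore code:

```go
package main

import "fmt"

// prefixLen is the number of leading hash characters used as the
// key prefix. Three is an assumption made for this sketch.
const prefixLen = 3

// blockKey (hypothetical helper) maps a block hash to an S3 object
// key of the form "<first prefixLen chars>/<hash>", so that blocks
// are spread across many prefixes instead of sharing "BucketName/".
func blockKey(hash string) (string, error) {
	if len(hash) < prefixLen {
		return "", fmt.Errorf("block hash %q is shorter than %d characters", hash, prefixLen)
	}
	return hash[:prefixLen] + "/" + hash, nil
}

func main() {
	key, err := blockKey("acbd18db4cc2f85cedef654fccc4a4d8")
	if err != nil {
		panic(err)
	}
	fmt.Println(key) // prints "acb/acbd18db4cc2f85cedef654fccc4a4d8"
}
```

With three hex characters there would be up to 16^3 = 4096 distinct prefixes per bucket, each subject to its own request limit if the per-prefix limits apply as described above. Existing buckets would also need a migration or a fallback lookup for blocks stored under the old flat layout, which this sketch does not address.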