Peter Amstutz wrote:
nit: The purpose of this method seems to be to fill in mountsByClass, but I wouldn't use the word "cache"; it is more like a "lookup table" (so suggesting setupCaches() → setupLookupTables()).
Agreed, done.
- As a minor optimization, you could calculate the number of slots ahead of time, so that on each run of "balanceBlock" the slots can be allocated once with the correct capacity, avoiding any reallocation.
Done.
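For illustration, a minimal Go sketch of the idea; the type and variable names (Server, Mount, slot) are stand-ins, not the actual keep-balance identifiers:

```go
package main

import "fmt"

// Illustrative stand-ins for the real keep-balance types.
type Mount struct{ UUID string }
type Server struct{ Mounts []*Mount }

type slot struct {
	mnt *Mount
	// a real slot would also track whether this mount holds a replica
}

func main() {
	servers := []*Server{
		{Mounts: []*Mount{{UUID: "m1"}, {UUID: "m2"}}},
		{Mounts: []*Mount{{UUID: "m3"}}},
	}

	// Count the slots once, before iterating over blocks.
	nSlots := 0
	for _, srv := range servers {
		nSlots += len(srv.Mounts)
	}

	// Inside each balanceBlock run, allocate with exact capacity so
	// append never triggers a reallocation.
	slots := make([]slot, 0, nSlots)
	for _, srv := range servers {
		for _, mnt := range srv.Mounts {
			slots = append(slots, slot{mnt: mnt})
		}
	}
	fmt.Println(len(slots), cap(slots)) // 3 3
}
```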
- I see that a newly written block can't be trashed until the TTL expires. I also see that it preferentially pulls blocks to slots with the desired storage class. This means blocks written to "default" but intended for a different storage class will tend to be over-replicated for some period of time, which is counter-productive if the goal of storage classes is cost savings.
"Everything is written to default, and later moved to cheaper storage" is the first use case we want to address, and in that case temporary over-replication is inevitable. Aside from that, no amount of client-side support would have been particularly useful without keep-balance support anyway -- keep-balance would just end up unbalancing whatever the client managed to achieve.
Maybe the sorting should always prefer read-only slots?
It does prefer read-only slots if they satisfy the requirements. Do you mean it should prefer read-only slots that don't satisfy the requirements? That would be an optimization if there was a strict ordering of classes. But I don't think it would work if you had classes like "more-private" and "more-durable": "you asked me to put a copy in more-durable, but we already have a copy in a readonly more-private volume, so I didn't bother making a more-durable copy".
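As a hedged sketch of the ordering being described (the field names and ranking are illustrative, not the actual implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// slot is an illustrative stand-in; only the fields relevant to the
// ordering are shown.
type slot struct {
	readonly       bool
	satisfiesClass bool // mount has the storage class wanted for this block
}

// rank returns a sort key: lower is better. Slots that satisfy the
// wanted class come first, and among those, read-only slots are
// preferred. Read-only slots that do NOT satisfy the requirements get
// no special treatment, per the discussion above.
func rank(s slot) int {
	r := 0
	if !s.satisfiesClass {
		r += 2
	}
	if !s.readonly {
		r++
	}
	return r
}

func main() {
	slots := []slot{
		{readonly: false, satisfiesClass: true},
		{readonly: true, satisfiesClass: false},
		{readonly: true, satisfiesClass: true},
	}
	sort.SliceStable(slots, func(i, j int) bool {
		return rank(slots[i]) < rank(slots[j])
	})
	fmt.Printf("%+v\n", slots)
}
```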
- We discussed that a collection can specify multiple storage classes, in which case the block is supposed to go only to mounts that satisfy all the storage classes. As a simplifying assumption we're only allowing one storage class. If I squint a little bit I think I see a way to handle multiple storage classes without having to totally overhaul the "distribute to slots" algorithm, but if you have any thoughts on that, it might make a nice TODO comment for whenever we need to implement that feature.
Added.
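One possible shape for the multiple-class check, sketched under the assumption of a per-block Desired map (class name → wanted replica count) and a per-mount set of classes; the helper name is hypothetical:

```go
package main

import "fmt"

// mountSatisfiesAll reports whether a mount's storage classes cover
// every class the block is wanted in. With the current simplifying
// assumption, desired has at most one entry; the same check would
// work unchanged for multiple classes.
func mountSatisfiesAll(mountClasses map[string]bool, desired map[string]int) bool {
	for class, n := range desired {
		if n > 0 && !mountClasses[class] {
			return false
		}
	}
	return true
}

func main() {
	mount := map[string]bool{"default": true, "archive": true}
	fmt.Println(mountSatisfiesAll(mount, map[string]int{"archive": 2}))               // true
	fmt.Println(mountSatisfiesAll(mount, map[string]int{"archive": 2, "durable": 1})) // false
}
```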
- I know this is not part of the ticket, but I really hate the notation 0=have<want: it is not common mathematical notation, and it is important for admins to clearly understand what the report is saying. Could we please print something less ambiguous, like (0=have)<want or have<want & have=0? (This could also be addressed with a keep-balance admin page.)
Improving the report sounds great but I'm not sure that's a good place to start. Anyway, since this isn't part of the ticket, I think we should leave it alone for now.
- What's your sense of how this is all going to scale when it needs to balance ten million blocks?
Time is dominated by retrieving collections from the API server, and I don't expect that to change here. A map[string]int is bigger than an int, so memory use will be higher. I've added a TODO for pooling Desired maps, so we can reduce memory per megablock if it becomes an issue.
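The pooling TODO could be as simple as a sync.Pool of maps; this is a sketch of the idea, not the committed code:

```go
package main

import "sync"

// desiredPool recycles Desired maps (class name → wanted replicas) so
// each block doesn't allocate a fresh map.
var desiredPool = sync.Pool{
	New: func() interface{} { return make(map[string]int) },
}

func getDesired() map[string]int {
	return desiredPool.Get().(map[string]int)
}

func putDesired(m map[string]int) {
	// Clear before pooling so stale entries can't leak into the next
	// block's accounting.
	for k := range m {
		delete(m, k)
	}
	desiredPool.Put(m)
}

func main() {
	d := getDesired()
	d["default"] = 2
	putDesired(d)
}
```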
(Aside: The ultimate scaling device is sharding the blocks: balance [0-7].*, then [8-f].*, etc. This makes each balancing cycle cost more CPU time, but scales to an arbitrary number of blocks with a given amount of memory per keep-balance machine.)
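A minimal sketch of what that prefix sharding could look like, with the shard count and helper name chosen purely for illustration:

```go
package main

import "fmt"

// inShard reports whether a block hash belongs to the given shard,
// using its first hex nibble: with nShards=2, [0-7].* falls in shard 0
// and [8-f].* in shard 1.
func inShard(hash string, shard, nShards int) bool {
	nibble := int(hash[0])
	switch {
	case nibble >= '0' && nibble <= '9':
		nibble -= '0'
	default:
		nibble = nibble - 'a' + 10 // Keep hashes are lowercase hex
	}
	return nibble*nShards/16 == shard
}

func main() {
	fmt.Println(inShard("7a3f...", 0, 2)) // true
	fmt.Println(inShard("9c01...", 0, 2)) // false
}
```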
12708-balance-storage-classes @ 2957b917aaaefc1485e1c5293d413f0931b3030a