Idea #7995
[Documentation] Document Keep Balance setup in the Install Guide (Closed)
Added by Brett Smith about 9 years ago. Updated almost 8 years ago.
Description
It should be as complete as any other page in the install guide. The only caveats are:
- It should come with a huge unmissable disclaimer at the top that Keep Balance is still being tested.
- There are only two cases where we think it might be safe:
- All your Keepstores are backed by their own POSIX filesystem(s)
- All your Keepstores are backed by shared object storage, one of them has a special service_type, and Data Manager talks to that one alone through its corresponding service_type switch (see the sketch after this list)
- It should not be linked from the TOC. Enough people want it that we want a single reference to give to interested deployers, but we don't want to generally advertise it.
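For the shared-object-storage case, the "special service_type" would be set when registering that Keepstore's keep_services record through the API, following the usual arv CLI registration pattern from the keepstore install page. This is only a sketch: the host name and the service_type value below are made-up placeholders, and the exact type Data Manager should be pointed at is not specified in this ticket.
~$ read -rd $'\000' keepservice <<EOF
{
 "service_host":"keep-shared.example.com",
 "service_port":25107,
 "service_ssl_flag":false,
 "service_type":"special-example"
}
EOF
~$ arv keep_service create --keep-service "$keepservice"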
Functional requirements:
- Document how to do a dry run/log-only run first, then how to switch that to actually deleting blocks once you're satisfied with the result.
This is how the datamanager token is generated:
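(The command itself isn't reproduced in this ticket. A sketch, assuming the create_superuser_token script mentioned later in this thread is run on the API server host, with a typical install path and Rails environment:)
~$ cd /var/www/arvados-api/current
~$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb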
Updated by Brett Smith about 9 years ago
- Description updated (diff)
- Category set to Documentation
Updated by Brett Smith about 9 years ago
- Target version set to Arvados Future Sprints
Updated by Brett Smith about 9 years ago
- Description updated (diff)
- Story points set to 1.0
Updated by Tom Morris about 8 years ago
- Subject changed from [Documentation] Document Data Manager setup in the Install Guide to [Documentation] Document Keep Balance setup in the Install Guide
- Description updated (diff)
Updated by Tom Morris almost 8 years ago
- Target version changed from Arvados Future Sprints to 2017-03-01 sprint
Updated by Tom Clegg almost 8 years ago
7995-keep-balance-docs @ 14304c7af0b0dd7bc9345b6c5aeb61a3bdc1d3b0
Updated by Tom Morris almost 8 years ago
I made a few copy edits and pushed them to the branch. Please review them to make sure that things are still technically correct.
I didn't run linkchecker due to Python dependency issues that I couldn't be bothered to sort out.
In addition, I have the following questions/comments:
- "privileged token" is inconsistent with the name of the script "create_superuser_token"
- Creating the privileged token doesn't seem to include a name or description which can be traced back to its use as a keep-balance token. Is there a way to include some identifying information so that we know which tokens are used for what?
- What is the default setting for delete in keepstores? The implication of the "Enable delete" section is that it's disabled by default, but that's never explicitly mentioned.
- I think we should pick one preferred way of enabling delete and recommend that. Both options (along with their priority ordering for overriding each other) can be documented in the keep-balance reference page (which I can't seem to find).
Bonus semi-related comment:
- The keepstore install page talks about setting up local-filesystem-backed storage and has a separate page for Azure blob storage, but S3-compatible (AWS/GCP) blob storage is not documented anywhere that I can find.
Updated by Tom Clegg almost 8 years ago
Tom Morris wrote:
I made a few copy edits and pushed them to the branch. Please review them to make sure that things are still technically correct.
LGTM thanks
- "privileged token" is inconsistent with the name of the script "create_superuser_token"
Updated (here and in the crunch2 dispatch page I copied it from)
- Creating the privileged token doesn't seem to include a name or description which can be traced back to its use as a keep-balance token. Is there a way to include some identifying information so that we know which tokens are used for what?
We don't have that yet (but it does sound like a good idea)
- What is the default setting for delete in keep stores? The implication of the "Enable delete" section is that it's disabled by default, but that's never explicitly mentioned.
Default is disabled -- added a note to that effect.
- I think we should pick one preferred way of enabling delete and recommend that. Both options (along with their priority ordering for overriding each other) can be documented in the keep-balance reference page (which I can't seem to find).
YAML is the future but the keepstore install page still tells you to use command line flags, so I commented out the YAML option for now.
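For the record, a sketch of the command-line way to enable delete on a keepstore node, following the runit-style startup scripts used in the install guide; the volume path is a placeholder, and a real invocation will carry whatever other flags your keepstore already uses:
~$ printf '#!/bin/sh\nexec keepstore -never-delete=false -volume=/mnt/local-disk 2>&1\n' | sudo tee run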
So our related-todo list is:
- The keepstore install page talks about setting up local-filesystem-backed storage and has a separate page for Azure blob storage, but S3-compatible (AWS/GCP) blob storage is not documented anywhere that I can find.
- Name/label option for "create superuser token" script (also, shorthand for scopes, like "keep-balance")
- Update keepstore (and keep-balance) docs to configure keepstore with YAML instead of command line flags
- Document keepstore S3 volumes
Updated by Javier Bértoli almost 8 years ago
Tom Morris,
- From this text:
Keep-balance can be installed anywhere with network access to Keep services. Typically it runs on the same host as keepproxy.
Keepproxy is optional, as I understand it. If so, can I have more than one keep-balance, installed on one or more keepstore hosts?
- From this text:
Keep-balance deletes unreferenced and overreplicated blocks from Keep servers, makes additional copies of underreplicated blocks, and moves blocks into optimal locations as needed (e.g. after adding new servers).
I understand that keep-balance performs three operations:
1. deletes unreferenced and overreplicated blocks from Keep servers,
2. makes additional copies of underreplicated blocks, and
3. moves blocks into optimal locations as needed
But for this text:
If you are installing keep-balance on an existing system with valuable data, you can run keep-balance in "dry run" mode first and review its logs as a precaution. To do this, use the keepstore -never-delete=true flag or remove the -commit-trash flag from your keep-balance startup script.
and this snippet:
~$ printf '#!/bin/sh\nexec keep-balance -commit-pulls -commit-trash 2>&1\n' | sudo tee run
I understand that -never-delete=true will prevent the FIRST of those actions, but nothing makes me assume it will prevent the other two. -commit-trash (from the runit example) sounds like a completely different parameter, and I suspect I'd need to disable the three parameters independently to have a REAL dry run:
- -never-delete=true
- -commit-trash=false
- -commit-pulls=false
Am I right?
Perhaps we need to:
- Go with the established de-facto names for this operation: --dry-run, -n or -noop (a new ticket surely?).
- If this is not a priority now, I'd make it extra clear in the documentation which of these parameters affect which of the THREE operations, and which combination performs a real dry run.
Updated by Tom Clegg almost 8 years ago
Updated the dry run instructions:
To do this, edit your keep-balance startup script to use the flags -commit-pulls=false -commit-trash=false.
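Applied to the runit example quoted earlier in this thread, a dry-run startup script would then look like this (sketch only):
~$ printf '#!/bin/sh\nexec keep-balance -commit-pulls=false -commit-trash=false 2>&1\n' | sudo tee run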
Updated by Tom Clegg almost 8 years ago
Javier Bértoli wrote:
Keepproxy is optional, as I understand it. If so, can I have more than one keep-balance, installed in 1+ keepstores?
Yes, it's possible to run many things (but not the Workbench uploader) without keepproxy.
Added a bold paragraph: A cluster should have only one keep-balance process running at a time.
(Does that answer the question?)
Updated by Javier Bértoli almost 8 years ago
Tom Clegg wrote:
Updated the dry run instructions:
To do this, edit your keep-balance startup script to use the flags -commit-pulls=false -commit-trash=false.
I notice you added these two flags and removed -never-delete=true. Is that correct, or just missed adding it?
Updated by Tom Clegg almost 8 years ago
Javier Bértoli wrote:
I notice you added these two flags and removed -never-delete=true. Is that correct, or just missed adding it?
That's correct.
keep-balance -commit-pulls=false -commit-trash=false
means go through the motions but don't tell the keepstore nodes to delete any blocks (or make any additional copies).
keepstore -never-delete=true
means ignore keep-balance if it tells keepstore to delete any blocks.
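Put another way, the two flags guard different ends of the pipeline (the same commands as above, annotated; any other flags your setup needs are unchanged):
# keep-balance: compute the plan and log it, but send no pull or trash lists to keepstores
keep-balance -commit-pulls=false -commit-trash=false
# keepstore: refuse trash/delete requests even if a keep-balance sends them
keepstore -never-delete=true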
Updated by Tom Clegg almost 8 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:1e6a756a10a1c0a77aeea5041844ba3a572bdd70.