Feature #16347

crunch-run runs local keepstore

Added by Peter Amstutz over 1 year ago. Updated 2 days ago.

Status:
In Progress
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Start date:
10/08/2021
Due date:
% Done:

0%

Estimated time:
(Total: 0.00 h)
Story points:
-

Description

  • Runs arvados-server (same binary as crunch-run) in a subprocess in keepstore when starting up crunch-runs
  • Config parameter which for (N * number of allocated CPU cores) number of keepstore buffers
  • Use environment variable ARVADOS_KEEP_SERVICES to override the keepstore server used by arv-mount and inside the container when {API: true}
  • Shut down keepstore service when shutting down crunch-run

Subtasks

Task #18229: Review 16347-container-local-keepstoreIn ProgressWard Vandewege


Related issues

Related to Arvados Epics - Story #16516: Run Keepstore on local compute nodesIn Progress10/01/202111/30/2021

History

#2 Updated by Peter Amstutz over 1 year ago

#4 Updated by Peter Amstutz 9 months ago

  • Related to Story #16516: Run Keepstore on local compute nodes added

#5 Updated by Peter Amstutz 9 months ago

#6 Updated by Peter Amstutz 9 months ago

  • Subject changed from run keepstore locally on compute node in standard configuration to crunch-run runs local keepstore

#7 Updated by Peter Amstutz 23 days ago

  • Target version set to 2021-10-13 sprint

#8 Updated by Peter Amstutz 23 days ago

  • Description updated (diff)

#9 Updated by Peter Amstutz 23 days ago

  • Category set to Keep

#10 Updated by Peter Amstutz 22 days ago

  • Assigned To set to Tom Clegg

#11 Updated by Tom Clegg 15 days ago

  • Status changed from New to In Progress

#12 Updated by Tom Clegg 15 days ago

So far:

16347-container-local-keepstore @ e2aaf1be932a49cdcfa140167ca296e39b6e5158 -- https://ci.arvados.org/view/Developer/job/developer-run-tests/2726/

  • new config Containers.LocalKeepBlobBuffersPerVCPU, default 0
  • tested successfully on ce8i5, see ce8i5-xvhdp-bwrxfm4znro10b0
  • adds keepstore.txt file to container log
  • fixes keepstore's Azure driver log spam (404 when checking whether a block already exists before writing it)
tbd
  • figure out a test case
  • decide how to spell keepstore.txt config options:
    • log everything (too verbose, unless debugging)
    • suppress "request" and "successful response" lines -- only log startup messages, error responses, backend errors, anything else that might get logged by third party libraries, etc.
    • discard all logs (more secure)
  • decide how to spell the config option to enable "send copy of all logs to systemd-cat" (currently always enabled if systemd-cat exists, which is not ideal because journald does a ridiculous amount of IO and the logs typically disappear via instance shutdown before anyone bothers to look at them anyway)

#13 Updated by Tom Clegg 13 days ago

16347-container-local-keepstore @ bda9093be4d24d45a6fff29148fbb5438e283897 -- https://ci.arvados.org/view/Developer/job/developer-run-tests/2729/

  • adds config for keepstore logging -- LocalKeepLogsToContainerLog: none/all/errors ("errors" isn't the perfect description of what it does, though. It includes startup messages like "starting keepstore on http://localhost:35471", "started volume zzzzz-nyw5e-000000000000001", etc. Which is wrong, the name or the behavior?)
  • adds test case

I think the logging-to-systemd question deserves its own issue. For now I've just made it skip the systemd-cat setup (instead of preventing anything from working at all) when systemd-cat is not in PATH.

#14 Updated by Peter Amstutz 8 days ago

  • Target version changed from 2021-10-13 sprint to 2021-10-27 sprint

#15 Updated by Ward Vandewege 6 days ago

Reviewing bda9093be4d24d45a6fff29148fbb5438e283897

In lib/config/config.default.yml

+      # This feature has security implications. (1) Container logs
+      # will include keepstore log files, which typically reveal some
+      # volume configuration details, error messages from the cloud
+      # storage provider, etc., which are not otherwise visible to
+      # users. 

This should probably be updated to indicate that the container logs are disabled by default?

It may also be worth noting here that if this feature is enabled, the communication with the local keepstore process will happen over a tcp connection on the localhost interface without ssl.

+      LocalKeepBlobBuffersPerVCPU: 0

I wonder if that is the correct default. It's backwards compatible but probably not what ideal in most scenarios.

I think we should also add a page to the Architecture section of our docs that describes this feature and why it is important. I can write that if you want, I suppose we can do a followup story for that.

No comments on the rest of the diff, it looks great. LGTM, thanks!

#16 Updated by Tom Clegg 5 days ago

I forgot to mention some limitations that need to be documented until we can fix them.
  • if all of your volumes have AccessViaHost, this won't work because the local keepstore process won't have any volumes. This is detected by config-check etc. as an invalid config.
  • if some of your volumes have AccessViaHost, those volumes will be inaccessible, which means some blocks in a container's inputs or docker image container might be inaccessible, causing containers to fail. We could address this by having keepstore fall back on proxying to other keepstores but that pretty much requires our future keep architecture.
  • if any volume has lower replication than the default collection replication level, writes will fail "insufficient replicas" when keepstore decides to write to those volumes (because keepstore returns after writing to a single volume even if the client wanted more replicas). We could address this by adding an "I'm the only keepstore" flag to keepstore that causes it to write multiple volumes instead.

Perhaps the 2nd and 3rd points should throw config-check warnings or errors?

Or, we could interpret the config as "...if possible" and just disable it automatically in these troublesome situations. That way the default could be non-zero.

I think we should also add a page to the Architecture section of our docs that describes this feature and why it is important.

Agreed

This should probably be updated to indicate that the container logs are disabled by default?

Oops, fixed.

16347-container-local-keepstore @ c8d252f51c23484484e4aa023fcd1f86ee961eab

#17 Updated by Ward Vandewege 3 days ago

Tom Clegg wrote:

I forgot to mention some limitations that need to be documented until we can fix them.
  • if all of your volumes have AccessViaHost, this won't work because the local keepstore process won't have any volumes. This is detected by config-check etc. as an invalid config.
  • if some of your volumes have AccessViaHost, those volumes will be inaccessible, which means some blocks in a container's inputs or docker image container might be inaccessible, causing containers to fail. We could address this by having keepstore fall back on proxying to other keepstores but that pretty much requires our future keep architecture.
  • if any volume has lower replication than the default collection replication level, writes will fail "insufficient replicas" when keepstore decides to write to those volumes (because keepstore returns after writing to a single volume even if the client wanted more replicas). We could address this by adding an "I'm the only keepstore" flag to keepstore that causes it to write multiple volumes instead.

Perhaps the 2nd and 3rd points should throw config-check warnings or errors?

Yeah the 2nd point should also be a config-check error.

Would the 3rd point be a problem in general? Not just when Containers.LocalKeepBlobBuffersPerVCPU is non-zero?

Or, we could interpret the config as "...if possible" and just disable it automatically in these troublesome situations. That way the default could be non-zero.

That would remove need for configuration, so that sounds good. Would there be a way for an admin to tell if a compute node used a local keepstore or not from the container (requeest) logs? We'd need that, and we would need to clarify how to do that in the docs.

I think we should also add a page to the Architecture section of our docs that describes this feature and why it is important.

Agreed

This should probably be updated to indicate that the container logs are disabled by default?

Oops, fixed.

16347-container-local-keepstore @ c8d252f51c23484484e4aa023fcd1f86ee961eab

Thanks!

#18 Updated by Tom Clegg 2 days ago

16347-container-local-keepstore @ 68259bcde57277cb709296fc24e86826d9c131d5 -- https://ci.arvados.org/view/Developer/job/developer-run-tests/2735/
  • (re-ran the "remainder" job which failed on a port# collision: https://ci.arvados.org/job/developer-run-tests-remainder/2843/)
  • default is LocalKeepBlobBuffersPerVCPU: 1
  • if LocalKeepBlobBuffersPerVCPU>0 but volume config is unsuitable, crunch-job logs a message to that effect and then proceeds as if LocalKeepBlobBuffersPerVCPU=0
  • reserves extra memory when choosing node type

Is the plain http point worth mentioning if it's only listening on localhost? It seems to me we do this in various places like nginx→controller so I wouldn't think it would surprise anyone.

How about something like this for the release notes:

When Arvados runs a container, the crunch-run supervisor process now brings up its own keepstore server to handle I/O for mounted collections, outputs, and logs. With the default configuration, the keepstore process allocates one 64 MiB block buffer per VCPU requested by the container. For most workloads this will increase throughput, reduce total network traffic, and make it possible to run more containers at once without provisioning additional keepstore nodes to handle the I/O load.
  • If you have containers that can effectively handle multiple I/O threads per VCPU, consider increasing the Containers.LocalKeepBlobBuffersPerVCPU value, or setting it to 0 to disable this feature.
  • This feature is enabled only if no volumes use AccessViaHosts, and no volumes have underlying Replication less than the default collection replication. If the feature is configured but cannot be enabled due to an incompatible volume configuration, this will be noted in the crunch-run.txt file in the container log.

Would the 3rd point be a problem in general? Not just when Containers.LocalKeepBlobBuffersPerVCPU is non-zero?

(the "insufficient replicas" thing) isn't a problem when a client can access multiple keepstore servers (client writes to multiple keepstore servers until reaching desired replication) or a single keepproxy server (keepproxy writes to multiple keepstore servers). So I think this would only come up elsewhere if you manually set ARVADOS_KEEP_SERVICES to a single keepstore server, or if you only provision one keepstore node.

Also available in: Atom PDF