Provide a mechanism to store "secrets" securely
There needs to be a way to store credentials and other secrets securely for use by crunch containers.
Requirements / use cases:
- At runtime, a crunch container must be able to get credentials to access remote resources, for example to transfer data to remote object storage.
- Storing credentials in easily visible places such as the container record, git repo, keep collection or docker image is unacceptable.
- Storing unencrypted credentials directly in the Arvados Postgres database is not desirable. Homebrew schemes for encrypting credentials are also not desirable.
- Credentials are referred to by a symbolic name, which is looked up to obtain the actual credentials.
- Because container access is based on the Arvados user (plus a special container token), access control for reading credentials should also be based on the Arvados user and/or the container token.
- From slides:
- All third-party credentials will be stored in a Vault-like secure system. The structure of a vault is key=value,
- where the value is the credentials provided by the third party and the key is generated by the operations portal.
- Initial proposal for generating credentials key is:
Based on the proposal, the essential development tasks are:
- Create secrets table in API server
- Create a Vault plugin that enables login with an Arvados API token and interacts with the secrets table to determine the policy granting access to secrets.
- Arvados client support to work with secrets (at minimum, a command line client for reading, writing, and listing secrets which interacts with the API server and Vault)
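As a rough illustration of the secrets table task, the sketch below models one row and the per-user lookup by symbolic name. The field names (`owner_uuid`, `vault_path`, `readers`), the helper `resolve`, and the UUID-like strings are all hypothetical; nothing here is an existing Arvados schema. The point is only that the table maps a symbolic name to a location in Vault and never holds the secret value itself.

```python
from dataclasses import dataclass, field


@dataclass
class SecretRecord:
    """Hypothetical row of the proposed secrets table (field names are assumptions)."""
    name: str         # symbolic name that workflows refer to
    owner_uuid: str   # Arvados user that owns the secret
    vault_path: str   # location of the value in Vault; the value is not stored here
    readers: set = field(default_factory=set)  # other users allowed to read it


def resolve(table, name, user_uuid):
    """Return the Vault path for `name` if `user_uuid` may read it."""
    record = table.get(name)
    if record is None:
        raise KeyError(f"no secret named {name!r}")
    if user_uuid != record.owner_uuid and user_uuid not in record.readers:
        raise PermissionError(f"{user_uuid} may not read {name!r}")
    return record.vault_path


# Placeholder identifiers, not real Arvados UUIDs.
table = {
    "s3-upload": SecretRecord(
        name="s3-upload",
        owner_uuid="user-a",
        vault_path="secret/arvados/s3-upload",
        readers={"user-b"},
    )
}
```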
In order to integrate secrets handling into CWL, a couple of additional tasks are necessary:
- arvados-cwl-runner feature to indicate inputs that represent "secrets" and adjust the container request accordingly.
- Crunch-run feature to access Vault and perform substitution of secret into config file or environment just-in-time, as part of container setup, prior to running container.
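The just-in-time substitution step could look something like the following sketch. The placeholder syntax `$(secret:NAME)` and the function `substitute_secrets` are invented for illustration and are not an existing crunch-run or CWL convention. `fetch_secret` stands in for whatever call eventually reads the value from Vault.

```python
import re

# Hypothetical placeholder syntax: $(secret:NAME)
SECRET_REF = re.compile(r"\$\(secret:([A-Za-z0-9_.-]+)\)")


def substitute_secrets(env, fetch_secret):
    """Return a copy of `env` with secret placeholders replaced by their values.

    `fetch_secret(name)` is assumed to look the secret up (for example in Vault)
    and return it as a string. The original mapping is left untouched so the
    stored container record never contains the resolved values.
    """
    resolved = {}
    for key, value in env.items():
        resolved[key] = SECRET_REF.sub(lambda m: fetch_secret(m.group(1)), value)
    return resolved


# Example with an in-memory stand-in for the Vault lookup.
fake_vault = {"s3-upload-key": "EXAMPLE-SECRET-VALUE"}
container_env = {
    "AWS_SECRET_ACCESS_KEY": "$(secret:s3-upload-key)",
    "PATH": "/usr/bin",
}
runtime_env = substitute_secrets(container_env, fake_vault.__getitem__)
```

Only `runtime_env` would be handed to the container process; `container_env`, which is what would be persisted, still holds just the symbolic reference.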
Updated by Peter Amstutz over 5 years ago
- Description updated
A CWL job needs to push some data from keep to an external service that needs credentials (e.g. an AWS bucket with aws_access_key_id and aws_secret_access_key, or SFTP with username/password). This means that we need to give those credentials to the job at the moment we make the transfer.
Options we discard:
1. put the credentials as an input to the job
2. add the credentials to the docker image that runs
3. bake the compute image with those credentials
The problem with all of these is that the credentials can easily be obtained by other users or other jobs.
It seems to me that for this problem, at a conceptual level, it would be ideal if the per-container arvados token (which is only valid for the duration of the container's lifecycle, right?) could be used to access Vault and get read access to the credentials needed for the container to do its job.
I assume that would mean a Vault plugin, so that
vault login arvados_token
would result in a callback to our API server to ensure the token is valid.
Vault would then also need a mechanism to determine what policy (aka, what secrets) the arvados_token has access to. Should our API server tell it that, too?
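A minimal sketch of that login flow, assuming the API server answers both questions (is the token valid, and which policies apply). The names `login`, `validate_token` and `policies_for_user` are hypothetical and the callbacks are in-memory stand-ins; this is not Vault's plugin interface, only the shape of the decision.

```python
def login(arvados_token, validate_token, policies_for_user):
    """Model of a Vault login backed by the Arvados API server.

    validate_token(token) is assumed to call back to the API server and return
    the user UUID for a valid token, or None for an invalid or expired one.
    policies_for_user(uuid) is assumed to return the names of the secrets
    (policies) that user may read.
    """
    user_uuid = validate_token(arvados_token)
    if user_uuid is None:
        raise PermissionError("invalid or expired Arvados token")
    return {"user_uuid": user_uuid, "policies": sorted(policies_for_user(user_uuid))}


# In-memory stand-ins for the API server callbacks (placeholder values).
known_tokens = {"container-token-123": "user-a"}
user_policies = {"user-a": {"s3-upload", "sftp-partner"}}

session = login(
    "container-token-123",
    validate_token=known_tokens.get,
    policies_for_user=lambda uuid: user_policies.get(uuid, set()),
)
```

Since the per-container token stops being valid when the container finishes, the validation callback would start returning nothing for it at that point and later logins would fail.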
Other things discussed:
- Wrapped responses -- these are tokens which can only be used to read a secret once. Provides auditing capability to know if a secret was potentially intercepted by a third party. Could be used to pass through a token in the clear.
How do we automate the "wrap" invocation at the appropriate time?
This requires that the workflow submitter have access to Vault with credentials separate from their Arvados credentials. We decided we don't want that.
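To make the single-use and auditing properties of wrapped responses concrete, here is a toy in-memory model. It is not Vault's API; the class `WrapStore` and its methods are hypothetical names. A second attempt to unwrap the same token fails and is recorded, which is how an earlier interception would become visible.

```python
import secrets


class WrapStore:
    """Toy model of single-use wrapping tokens with an audit trail."""

    def __init__(self):
        self._wrapped = {}
        self.audit_log = []  # entries of (token, reader, outcome)

    def wrap(self, value):
        """Store `value` and return a random token that can read it once."""
        token = secrets.token_urlsafe(16)
        self._wrapped[token] = value
        return token

    def unwrap(self, token, reader):
        """Return the wrapped value and invalidate the token."""
        if token not in self._wrapped:
            self.audit_log.append((token, reader, "denied"))
            raise KeyError("wrapping token is unknown or already used")
        self.audit_log.append((token, reader, "ok"))
        return self._wrapped.pop(token)


store = WrapStore()
token = store.wrap("EXAMPLE-SECRET-VALUE")
first_read = store.unwrap(token, reader="container")
```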
Updated by Peter Amstutz over 5 years ago
Basic options are:
- Submitter gets a single use token or wrapped response from Vault, container looks it up, authorization based on vault user
- Submitter provides identifier of credential, container looks it up, authorization based on Arvados user
The first case requires that the submitter have access to the credential.
The second case makes it possible to limit the submitter's ability to read a credential.
-> For example, have a scoped token that only allows creating a container request and not reading anything back (even the container).
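The scoped-token idea can be sketched as a simple request filter. The scope strings below follow a "METHOD path" form, and the matching rules (exact match, or prefix match when the scope path ends in "/") are a simplification assumed for this example rather than a statement of how the API server evaluates scopes. `scope_allows` and `SUBMIT_ONLY` are hypothetical names.

```python
def scope_allows(scopes, method, path):
    """Return True if a request is permitted by a list of "METHOD path" scopes."""
    for scope in scopes:
        if scope == "all":
            return True
        allowed_method, _, allowed_path = scope.partition(" ")
        if allowed_method != method:
            continue
        if path == allowed_path:
            return True
        # Assumed rule: a scope path ending in "/" covers everything beneath it.
        if allowed_path.endswith("/") and path.startswith(allowed_path):
            return True
    return False


# A token that can submit container requests but cannot read anything back.
SUBMIT_ONLY = ["POST /arvados/v1/container_requests"]

can_submit = scope_allows(SUBMIT_ONLY, "POST", "/arvados/v1/container_requests")
can_read_back = scope_allows(SUBMIT_ONLY, "GET", "/arvados/v1/container_requests")
```

With such a token the submitter names the credential in the request but has no way to fetch the container, its logs, or the resolved secret afterwards.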