Credential storage » History » Version 5

Peter Amstutz, 06/14/2024 08:08 PM

h1. Credential storage

h2. Background

In order to implement [[Objects as pseudo-blocks in Keep]], we need a way to store credentials so that Arvados can authenticate to other systems, e.g. AWS S3.

The current system for managing secrets is specific to workflows and deletes the secret as soon as the workflow is finished.  However, we require a credential storage system that can be accessed by keepstore.

User perspective:

Users want to be able to manage credentials in Workbench, so that Arvados services that need them can look them up.  The motivating use case is AWS credentials, which consist of a key id/key secret pair (much like an Arvados API key uuid/secret), so that we can easily access objects in external S3 buckets.

h2. Requirements

* Secrets should have an id for what type of thing they are, e.g. AWS credentials.
* Secrets should have an optional scope, e.g. to be able to provide different credentials for different resources, buckets, etc.
* Open question: should the secret material itself be a simple text column or a JSON object?  For example, an AWS credential is a key id/secret pair.
* Different users should have different views of which secrets are available, based on Arvados permissions.  Users should be able to share secrets at different levels of access, e.g.
** can_read -- system services can fetch the credential on behalf of the user, but the user cannot fetch it directly through the API
** can_write -- user can update the credential, but still not read it back
** can_manage -- user can grant permissions to the credential, but still not read it back
* Secrets should be write-only as much as possible: system services can retrieve secrets, but users cannot, except in special circumstances.
** We want a way to use secrets in workflows, which means they can be exposed if developers are careless.  This is true of our current secrets support as well: it is inherently impossible to prevent a secret from being leaked by user-provided code if someone is really trying, but we will at least be able to keep a record of which workflows accessed those secrets.
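One possible reading of the requirements above, as a minimal Python sketch rather than a proposed schema. Every name in it (@Credential@, @fetch_secret@, @update_secret@, @find_credential@, the @aws_access_key@ type id) is hypothetical and not an existing Arvados API. It assumes that permission levels are hierarchical (can_manage implies can_write implies can_read), that the secret material is a JSON-style object, and that a scope is matched as a prefix of the resource being accessed; none of these choices is settled by the text above.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: each level implies the ones listed before it.
LEVELS = ("can_read", "can_write", "can_manage")


@dataclass
class Credential:
    uuid: str
    credential_type: str        # what kind of secret this is, e.g. "aws_access_key"
    scope: Optional[str]        # e.g. "s3://bucket-a/"; None means "any resource"
    secret: dict                # JSON-style material, e.g. key id + key secret
    permissions: dict = field(default_factory=dict)   # user -> permission level
    access_log: list = field(default_factory=list)    # (user, accessor) records


def has_level(cred, user, needed):
    """True if the user holds at least the needed level on this credential."""
    granted = cred.permissions.get(user)
    return granted is not None and LEVELS.index(granted) >= LEVELS.index(needed)


def fetch_secret(cred, user, via_system_service, accessor="unknown"):
    """Hand out the secret only to a system service acting for a permitted user."""
    if not via_system_service:
        # Write-only from the user's point of view: no direct read-back.
        raise PermissionError("secrets cannot be read back through the user API")
    if not has_level(cred, user, "can_read"):
        raise PermissionError("user may not use this credential")
    cred.access_log.append((user, accessor))   # record who used the secret
    return cred.secret


def update_secret(cred, user, new_secret):
    """Replace the secret material; requires can_write, returns nothing."""
    if not has_level(cred, user, "can_write"):
        raise PermissionError("user may not update this credential")
    cred.secret = new_secret


def find_credential(creds, credential_type, resource, user):
    """Pick the usable credential whose scope is the longest prefix of resource.

    An unscoped credential acts as a fallback. Returns None if nothing matches.
    """
    candidates = [
        c for c in creds
        if c.credential_type == credential_type
        and has_level(c, user, "can_read")
        and (c.scope is None or resource.startswith(c.scope))
    ]
    return max(candidates, key=lambda c: len(c.scope or ""), default=None)
```

In this sketch a service such as keepstore would call @find_credential@ with the target resource and then @fetch_secret@ with @via_system_service=True@, while a call made directly on behalf of a user is refused regardless of the user's permission level.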

h2. Security

Start with our threat model.

These are not passwords; they are credentials that will be provided to other services on behalf of the user, which means we have to be able to recover them in the clear and cannot hash them.  Unfortunately, a Google search for "how to store secrets in a database" turns up dozens of pages telling you not to store cleartext passwords and how to hash passwords, and not much advice on how to do what we need to do.

Ways credentials could leak:

* Attacker uses the Arvados API as a normal user
** Should be prevented from accessing credentials by normal access controls.
** As previously noted, if we want to provide credentials to user-supplied workflows, leaks by those workflows are impossible to defend against, so we have to exclude from the threat model users who are authorized to use the credentials doing anything they want with them.
* Attacker uses the Arvados API as a superuser
** Admins can already access mostly anything.
** The existing secret_mounts only makes it inconvenient for admins: if they can access the container's runtime token, they can fetch secret mounts.
** Boxing out admins via the API is probably possible but may require sealing additional holes (e.g. placing stricter limits on admins accessing API tokens of other users).
* Attacker gains access to the database
** Would be able to use SQL to read any column.  E.g. secret_mounts is currently not encrypted, so it would be vulnerable.
** To block this, columns need to be encrypted.
* Attacker gains access to the node the database is running on
** Same as remote database access, except the attacker additionally has access to /etc/arvados/config.yml and any credentials kept in there.
* Attacker can intercept communications with the database and/or API server
** This is probably game over for our entire security model, not just secrets handling.  We rely on TLS to prevent this.
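The database-access case in the list above can be illustrated with a small sketch of column encryption. This is an illustration under stated assumptions, not a proposed design: it uses the third-party Python @cryptography@ package (Fernet authenticated encryption) rather than anything Arvados ships, it uses an in-memory SQLite database as a stand-in for the real database, and the table name, column names and helper functions are hypothetical.

```python
import json
import sqlite3

from cryptography.fernet import Fernet  # third-party package, assumed installed


def store_credential(db, cipher, uuid, secret):
    """Encrypt the JSON secret before it ever reaches the database."""
    token = cipher.encrypt(json.dumps(secret).encode("utf-8"))
    db.execute(
        "INSERT INTO credentials (uuid, secret_enc) VALUES (?, ?)",
        (uuid, token),
    )


def load_credential(db, cipher, uuid):
    """Fetch the encrypted column and decrypt it with the out-of-database key."""
    row = db.execute(
        "SELECT secret_enc FROM credentials WHERE uuid = ?", (uuid,)
    ).fetchone()
    return json.loads(cipher.decrypt(row[0]))


# The key lives outside the database. Where it lives is the real design
# question; generating it in-process here is only for illustration.
key = Fernet.generate_key()
cipher = Fernet(key)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE credentials (uuid TEXT PRIMARY KEY, secret_enc BLOB)")

store_credential(
    db, cipher, "example-uuid",
    {"key_id": "AKIAEXAMPLE", "secret": "not-a-real-secret"},
)

# What an attacker with SQL access sees: an opaque token, not the secret.
raw = db.execute("SELECT secret_enc FROM credentials").fetchone()[0]
```

A reader with only SQL access gets @raw@, which is useless without the key. As the node-access bullet above points out, this only helps if the key is not readable from the same place as the database, so a key stored in the config file on the database node would not cover that case.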

h2. Implementation

The first rule of security software is don't build it yourself.  We need to do some research and see if there is something we could plug into and make part of our stack.

https://github.com/getsops/sops

https://github.com/Infisical/infisical

HashiCorp Vault would have been something to consider, but its licensing has changed, which would require us to use an older version and/or a fork.