Multi-cluster user database » History » Version 6
Tom Clegg, 07/25/2019 02:56 PM
h1. Multi-cluster user database

It is sometimes desirable to share a single user database across multiple Arvados clusters. For example:

* Clusters aaaaa, bbbbb, ccccc, ddddd, eeeee are on different continents, but they use the same upstream authentication providers (ldap/google).
* A down/unreachable cluster should not prevent any user from using _other_ clusters in the group -- even if the down/unreachable cluster is the one where the user's account was initially created.

This requires some changes to login and token validation. (Currently, any given user account has a single "home cluster" that can issue or validate tokens for it.)

h2. Logging in

Each user should be able to log in to their account using any cluster, regardless of where/whether they have logged in previously.

To achieve this (without depending on real-time communication between clusters) we need all of the participating clusters to agree on a mapping of upstream authentication results to Arvados user UUIDs. For example, if the upstream authentication result is @"ldap://ldap.example foo@bar.example"@ (meaning "ldap://ldap.example assures us this user is foo@bar.example"):

# If a row already exists in the users table with <code>upstream == "ldap://ldap.example foo@bar.example"</code>, use that row.
# Otherwise, create a new row with user UUID "eeeee-tpzed-${sha1part(upstream)}" (where eeeee is a common prefix used by all participating clusters and sha1part() is the first 15 chars of base-36-encoded sha1()).

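The two steps above can be sketched as follows. This is a minimal illustration, not Arvados code: @find_or_create_user@, the in-memory @users@ dict standing in for the users table, and the zero-padding of the base-36 string to a fixed width are all assumptions made for the example. The point it shows is that every cluster derives the same UUID from the same upstream string without talking to any other cluster.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1part(upstream: str) -> str:
    """Return the first 15 chars of the base-36-encoded SHA-1 of upstream."""
    n = int.from_bytes(hashlib.sha1(upstream.encode("utf-8")).digest(), "big")
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(BASE36_DIGITS[r])
    encoded = "".join(reversed(digits)) or "0"
    # Assumption: left-pad to 31 digits (enough for any 160-bit value) so
    # the result has a fixed width before it is truncated.
    return encoded.rjust(31, "0")[:15]


def find_or_create_user(users: dict, upstream: str, prefix: str = "eeeee") -> str:
    """Map an upstream authentication result to a user UUID.

    users is a hypothetical stand-in for the users table: upstream -> uuid.
    """
    # Step 1: an existing row (e.g. one imported from another cluster) wins.
    if upstream in users:
        return users[upstream]
    # Step 2: otherwise derive a UUID deterministically from the upstream.
    uuid = f"{prefix}-tpzed-{sha1part(upstream)}"
    users[upstream] = uuid
    return uuid
```

Because the new UUID depends only on the shared prefix and the upstream string, two clusters that see the same user for the first time independently create the same row.
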
To avoid changing existing user accounts' UUIDs to @eeeee-*@, we would do a one-time synchronization of user accounts (and their upstreams) across all participating clusters. For example, if aaaaa-tpzed-012340123401234 exists on cluster aaaaa, we would add that row to bbbbb and ccccc as well. Next time a user logs in to bbbbb with an upstream account matching aaaaa-tpzed-012340123401234, bbbbb would issue a token itself, rather than deferring to aaaaa.

Untrusted remote accounts (the kind that we already have in the users table with foreign UUIDs) have a null upstream field.

|uuid |upstream |significance |
|aaaaa-tpzed-aaaaaaaaaaaaaaa |google:// foo@bar.example |Imported/migrated from remote cluster aaaaa |
|eeeee-tpzed-012340123401234 |ldap://ldap.example foo@baz.example |User didn't exist before the multi-cluster user db system arrived |
|ooooo-tpzed-ooooooooooooooo |NULL |Remote user from cluster ooooo (not part of our multi-cluster group) |

h2. Configuration

Each cluster needs to know:

* the uuid prefix to use when creating a new account, e.g., "eeeee"
* additional user uuid prefixes that remote clusters are trusted to validate

<pre><code class="yaml">
Clusters:
  aaaaa:
    Login:
      AssignUUIDPrefix: eeeee
    RemoteClusters:
      bbbbb:
        Proxy: true
        Authenticate:
          aaaaa: {} # accept tokens issued by bbbbb for users with uuid aaaaa-*
          bbbbb: {} # (implied)
          eeeee: {} # accept tokens issued by bbbbb for users with uuid eeeee-*
</code></pre>

Example: aaaaa needs to validate a token issued by bbbbb.

* Do a callback to bbbbb (or check the JWT signature) to confirm that bbbbb really issued this token, and get the relevant user UUID (result: yes, user uuid is eeeee-tpzed-012340123401234).
* If config Clusters.aaaaa.RemoteClusters.bbbbb.Authenticate.eeeee is present, accept the token.
* Otherwise, fetch eeeee's config; if RemoteClusters.bbbbb.Authenticate.eeeee is present, accept the token.
* Otherwise, reject the token.

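The decision in the example above can be sketched as a small function. This is only an illustration under stated assumptions: @accept_token@, @trusts@ and @fetch_remote_config@ are hypothetical names, the configs are plain dicts holding one cluster's section of the YAML shown earlier, and the first step (confirming that the issuer really issued the token and learning the user UUID) is assumed to have already happened. The issuer is treated as always trusted for its own prefix, following the "(implied)" comment in the config.

```python
from typing import Callable, Optional


def trusts(cluster_config: dict, issuer: str, prefix: str) -> bool:
    """True if cluster_config has RemoteClusters.<issuer>.Authenticate.<prefix>."""
    authenticate = (
        cluster_config.get("RemoteClusters", {})
        .get(issuer, {})
        .get("Authenticate", {})
    )
    return prefix in authenticate


def accept_token(
    issuer: str,
    user_uuid: str,
    local_config: dict,
    fetch_remote_config: Callable[[str], Optional[dict]],
) -> bool:
    """Decide whether to accept a token that `issuer` issued for `user_uuid`.

    fetch_remote_config is a hypothetical callback that returns the config
    section of the cluster owning a uuid prefix, or None if unavailable.
    """
    prefix = user_uuid.split("-", 1)[0]
    # The issuer is implicitly trusted for users with its own prefix.
    if prefix == issuer:
        return True
    # Local config explicitly trusts the issuer for this prefix.
    if trusts(local_config, issuer, prefix):
        return True
    # Otherwise ask the cluster that owns the prefix whether it trusts the issuer.
    remote_config = fetch_remote_config(prefix)
    return remote_config is not None and trusts(remote_config, issuer, prefix)
```

For instance, with cluster aaaaa's section from the config above as @local_config@, a token issued by bbbbb for eeeee-tpzed-012340123401234 is accepted at the second check without fetching any remote config.
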
h2. Validating tokens

Each cluster should be able to validate a token that was issued by a different, currently unreachable, cluster. This contrasts with the current setup, where aaaaa validates tokens issued by bbbbb by doing a callback to bbbbb.

This seems easy enough: instead of random strings, tokens can be [like] JWT, signed by a private key whose public part is known by all clusters. (This would also be more efficient than callbacks, benefiting the mutually-untrusted cluster scenario too.)