Multi-cluster user database » History » Version 18

Tom Clegg, 08/20/2019 03:02 PM

h1. Multi-cluster user database

It is sometimes desirable to share a single user database across multiple Arvados clusters. For example:
* Clusters aaaaa, bbbbb, ccccc, ddddd, eeeee are on different continents, but they use the same upstream authentication providers (ldap/google) and a given user either has access to all clusters, or none.
* A down/unreachable cluster should not prevent any user from using _other_ clusters in the group -- even if the down/unreachable cluster is the one where the user's account was initially created.

This requires some changes to login and token validation. (Currently, any given user account has a single "home cluster" that can issue or validate tokens for it.)

h2. Logging in

Each user should be able to log in to their account using any cluster, regardless of where/whether they have logged in previously.

To achieve this, the participating clusters agree on a single authoritative "login cluster" where the user accounts are stored, and the others hand off authentication and token-issuing to that cluster.

If aaaaa, bbbbb, ccccc, and ddddd designate eeeee as the login cluster:
* Login process on eeeee: unchanged
* Login process on aaaaa:
** aaaaa-workbench presents a link to aaaaa-controller's login endpoint (unchanged)
** aaaaa-controller redirects to eeeee-controller's login endpoint (leaving the "return_to" parameter alone)
** eeeee login process proceeds as usual: proxy through to RailsAPI, update the users table, redirect to aaaaa-workbench with the new token ("v2/eeeee-...")
* When aaaaa-workbench presents a token to aaaaa-controller, the token will have an "eeeee" prefix, so aaaaa-controller will ask eeeee to validate it, and cache the result
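
The redirect step on a non-login cluster could be sketched as follows. This is only an illustration, not the controller's actual code: the function name @login_redirect@, its parameters, the @/login@ path and the example Workbench URL are all assumptions made for the sketch.

<pre><code class="python">
from urllib.parse import urlencode


def login_redirect(local_cluster, login_cluster, login_host, return_to):
    """Return the URL to redirect a login request to, or None if this
    cluster is the login cluster and handles the login itself.

    Hypothetical helper: names and the "/login" path are assumptions.
    """
    if local_cluster == login_cluster:
        # Login cluster: the login process is unchanged.
        return None
    # Pass return_to through untouched so the login cluster sends the
    # browser back to the Workbench that started the login.
    query = urlencode({"return_to": return_to})
    return "https://%s/login?%s" % (login_host, query)


# Example: aaaaa hands a login request off to eeeee.
url = login_redirect("aaaaa", "eeeee", "eeeee.arvadosapi.com",
                     "https://workbench.aaaaa.example/")
</code></pre>

Because @return_to@ is only URL-encoded and otherwise passed through unmodified, the login cluster needs no knowledge of which cluster started the login.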

h2. Token validation

Login tokens are validated as before:
* All login tokens start with "v2/eeeee-..." and are validated by calling back to eeeee

Per-container tokens are issued and validated as before:
* A container running on aaaaa has a "v2/aaaaa-..." token which is validated by checking aaaaa's local database
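
The routing rule above depends only on the cluster ID prefix of the token. A minimal sketch, assuming a v2 token has the form @v2/<uuid>/<secret>@ and that the UUID begins with the issuing cluster's ID (the function names and example tokens are hypothetical):

<pre><code class="python">
def token_cluster_id(token):
    """Return the cluster ID prefix of a v2 token, or None if the token
    is not in the assumed "v2/<uuid>/<secret>" form."""
    parts = token.split("/")
    if len(parts) != 3 or parts[0] != "v2":
        return None
    # Assumption: the UUID looks like "eeeee-xxxxx-...", so everything
    # before the first "-" is the issuing cluster's ID.
    return parts[1].split("-")[0]


def validator_for(token, local_cluster):
    """Decide where a token should be validated.

    Returns ("local", cluster_id) when the local database is
    authoritative, ("remote", cluster_id) when the issuing cluster must
    be asked, or None for token formats this sketch does not handle.
    """
    cluster_id = token_cluster_id(token)
    if cluster_id is None:
        return None
    if cluster_id == local_cluster:
        return ("local", cluster_id)
    return ("remote", cluster_id)
</code></pre>

On aaaaa, a login token such as @v2/eeeee-gj3su-0123456789abcde/secret@ would map to @("remote", "eeeee")@, while a per-container token with an aaaaa prefix would map to @("local", "aaaaa")@.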

h2. Configuration

Non-login clusters need to know:
* the login cluster's ID and API endpoint

<pre><code class="yaml">
Clusters:
  aaaaa:
    Login:
      # Hand off login and token issuing to eeeee
      LoginCluster: eeeee
    RemoteClusters:
      eeeee:
        Proxy: true
        # API endpoint of the login cluster
        Host: eeeee.arvadosapi.com
</code></pre>

h2. Distributed database

In the future, this can be extended by having all clusters in the group use a single distributed database for storing users and tokens with "eeeee" prefixes. For now, each non-login cluster will just continue to update its cached user record (if stale at token validation time) from the login cluster, and proxy update requests to the login cluster.
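
The interim behaviour (refresh a cached user record only when it is stale at token validation time) might look roughly like this. The class name @UserCache@, the @fetch_from_login_cluster@ callback, and the 300-second default lifetime are assumptions for illustration; proxying of update requests is not shown.

<pre><code class="python">
import time


class UserCache:
    """Cache user records on a non-login cluster (illustrative only)."""

    def __init__(self, fetch_from_login_cluster, ttl_seconds=300,
                 clock=time.monotonic):
        self._fetch = fetch_from_login_cluster  # callable: uuid -> record
        self._ttl = ttl_seconds                 # assumed staleness limit
        self._clock = clock                     # injectable for testing
        self._records = {}                      # uuid -> (fetched_at, record)

    def get(self, user_uuid):
        """Return the user record, refreshing it from the login cluster
        if there is no cached copy or the cached copy is stale."""
        now = self._clock()
        cached = self._records.get(user_uuid)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        record = self._fetch(user_uuid)
        self._records[user_uuid] = (now, record)
        return record
</code></pre>

Calling @get@ during token validation keeps the login cluster off the hot path for repeated requests, at the cost of user changes taking up to one lifetime to propagate.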

h2. Migration

An admin migration tool, given an admin token for each cluster, should load the user lists from all clusters in the group and merge non-eeeee accounts into new or existing eeeee-* accounts.
* The identity_url field no longer provides upstream auth info, so we'll need to match accounts by email address.
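
Matching by email address could be planned along these lines before any account is touched. This is a sketch under stated assumptions, not the migration tool itself: the function name @plan_merges@, the input shape (a dict of user lists with @uuid@ and @email@ keys), and the case-insensitive comparison of addresses are all hypothetical choices.

<pre><code class="python">
def plan_merges(users_by_cluster, login_cluster):
    """Group non-login-cluster accounts by email address.

    Returns a dict mapping each (lowercased) email to
    {"target": existing login-cluster UUID or None, "sources": [UUIDs]}.
    A target of None means a new login-cluster account is needed.
    """
    existing = {
        user["email"].lower(): user["uuid"]
        for user in users_by_cluster.get(login_cluster, [])
    }
    plan = {}
    for cluster_id in sorted(users_by_cluster):
        if cluster_id == login_cluster:
            continue  # login-cluster accounts are merge targets only
        for user in users_by_cluster[cluster_id]:
            email = user["email"].lower()
            entry = plan.setdefault(
                email, {"target": existing.get(email), "sources": []})
            entry["sources"].append(user["uuid"])
    return plan
</code></pre>

Producing a plan first lets an administrator review which accounts would be merged, and which would need a new account on the login cluster, before anything is changed.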