Multi-cluster user database » History » Revision 17

Revision 16 (Tom Clegg, 08/07/2019 05:46 PM) → Revision 17/18 (Tom Clegg, 08/20/2019 02:23 PM)

h1. Multi-cluster user database 

 It is sometimes desirable to share a single user database across multiple Arvados clusters. For example: 
 * Clusters aaaaa, bbbbb, ccccc, ddddd, eeeee are on different continents, but they use the same upstream authentication providers (ldap/google) and a given user either has access to all clusters, or none. 
 * A down/unreachable cluster should not prevent any user from using _other_ clusters in the group -- even if the down/unreachable cluster is the one where the user's account was initially created. 

 This requires some changes to login and token validation. (Currently, any given user account has a single "home cluster" that can issue or validate tokens for it.) 

 h2. Logging in 

 Each user should be able to log in to their account using any cluster, regardless of where/whether they have logged in previously. 

 To achieve this, the participating clusters need to agree on a single authoritative "login cluster" where user accounts are stored, and the others hand off authentication and token-issuing to that cluster.

 If aaaaa, bbbbb, ccccc, and ddddd designate eeeee as the login cluster:
 * Login process on eeeee: unchanged
 * Login process on aaaaa:
 ** aaaaa-workbench presents a link to aaaaa-controller's login endpoint (unchanged)
 ** aaaaa-controller redirects to eeeee-controller's login endpoint (leaving the "return_to" parameter alone)
 ** eeeee login process proceeds as usual: proxy through to RailsAPI, update the users table, redirect to aaaaa-workbench with a new token ("v2/eeeee-...")
 * When aaaaa-workbench presents a token to aaaaa-controller, the token will have an "eeeee" prefix, so aaaaa-controller will ask eeeee to validate it, and cache the result

 _Revision 16 text (replaced by the above in revision 17):_

 To achieve this (without depending on real-time communication between clusters) the participating clusters need to agree on a mapping of upstream authentication results to Arvados user UUIDs. For example, if the upstream authentication result is @"foo@bar.example"@ ("an upstream auth provider assures us this user is foo@bar.example"):
 # Generate a UUID "eeeee-tpzed-${sha1part(upstream)}" (where eeeee is a common prefix used by all participating clusters and sha1part() is the first 15 chars of base-36-encoded sha1())
 # If it doesn't already exist, add a row to the users table with this UUID
 # If another row exists in the users table with the same upstream (or same identity_url) but a different UUID, [offer to] merge the old account's data/objects/permissions into the new account (it isn't possible to log into the old account any more, but we know it belongs to the same person as the new account).

 This also makes upstream authentication providers equivalent: as long as they report the same IDs (email addresses), users/sites can switch upstream providers on the fly without having to merge or migrate accounts.

 Notes
 * the "upstream" field is similar to identity_url as initially conceived. Since #4601, identity_url has been an opaque SSO-generated UUID, with no info about upstream -- so we will rely on it to detect "same upstream as old account that needs to be migrated" but we can't use it to generate the same user UUID as other clusters, hence the need for a new "upstream" field
 * "remote" accounts (the kind that we already have in the users table with foreign UUIDs) have a null identity_url field, and will also have a null upstream field

 |uuid                        |upstream        |identity_url                |significance |
 |eeeee-tpzed-012340123401234 |foo@bar.example |login-tpzed-aaaaaaaaaaaaaaa |Newly created user account |
 |aaaaa-tpzed-aaaaaaaaaaaaaaa |NULL            |login-tpzed-aaaaaaaaaaaaaaa |Old user account (can't log in to this any more - contents should be migrated to eeeee-*) |
 |ooooo-tpzed-ooooooooooooooo |NULL            |NULL                        |Remote user from cluster ooooo (not part of our multi-cluster group) |
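 The revision 16 scheme for deriving a UUID from the upstream identity ("first 15 chars of base-36-encoded sha1()") could look roughly like the sketch below. The page does not specify the digit alphabet, zero padding, or any normalization of the upstream string, so the lowercase alphabet, the left-padding to 31 digits, and the helper name @user_uuid@ are assumptions made for illustration only.

```python
import hashlib

# Assumption: lowercase digits, matching the look of existing Arvados UUIDs.
BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1part(upstream):
    """Return the first 15 chars of the base-36-encoded SHA-1 of `upstream`."""
    n = int.from_bytes(hashlib.sha1(upstream.encode("utf-8")).digest(), "big")
    digits = ""
    while n:
        n, r = divmod(n, 36)
        digits = BASE36[r] + digits
    # Assumption: left-pad to 31 digits (the most a 160-bit value can need),
    # so that "first 15 chars" is well defined for every input.
    return digits.rjust(31, "0")[:15]


def user_uuid(prefix, upstream):
    """Hypothetical helper: build "<prefix>-tpzed-<sha1part(upstream)>"."""
    return "{}-tpzed-{}".format(prefix, sha1part(upstream))
```

 Because the result depends only on the shared prefix and the upstream identity, any participating cluster computes the same UUID without contacting the others, which is the property the revision 16 text relies on.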

 h2. Configuration 

 Non-login clusters need to know
 * the login cluster's ID and API endpoint

 <pre><code class="yaml">
 Clusters:
   aaaaa:
     Login:
       LoginCluster: eeeee
     RemoteClusters:
       eeeee:
         Proxy: true
         Host: eeeee.arvadosapi.com
 </code></pre>

 _Revision 16 text (replaced by the above in revision 17):_

 Each cluster needs to know
 * the uuid prefix to use when creating a new account, e.g., "eeeee" (this will be the initial "master" cluster -- see below)

 <pre><code class="yaml">
 Clusters:
   bbbbb:
     Login:
       AssignUUIDPrefix: eeeee
 </code></pre>

 The master cluster needs to know
 * which other clusters are authorized to issue tokens for "eeeee-tpzed-*" users

 <pre><code class="yaml">
 Clusters:
   eeeee:
     RemoteClusters:
       bbbbb:
         AuthenticateLocalUsers: true # all clusters should accept tokens issued by bbbbb for users with uuid eeeee-*
 </code></pre>

 Example: aaaaa needs to validate a token issued by bbbbb. 
 * Do a callback to bbbbb (or check JWT signature) to confirm bbbbb really issued this token and get the relevant user UUID (result: yes, user uuid is eeeee-tpzed-012340123401234). 
 * Fetch eeeee's config. 
 * If RemoteClusters.bbbbb.AuthenticateLocalUsers is true, accept the token. Otherwise, reject the token. 
 * If the token is accepted, update the local cache of the user record from eeeee. 
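 The accept/reject decision in the example above can be sketched as follows. This is only an illustration: the callback to the issuing cluster and the fetch of eeeee's config are assumed to have happened already and are passed in as plain values, the function name @accept_token@ is hypothetical, and accepting tokens that a cluster issues for its own users is an assumption not spelled out in the text.

```python
def accept_token(issuer, user_uuid, master_config):
    """Decide whether to accept a token that `issuer` confirmed it issued.

    `master_config` stands in for the config section of the cluster named by
    the user's UUID prefix (the "Clusters: eeeee:" subtree in the example).
    """
    home = user_uuid.split("-", 1)[0]
    if issuer == home:
        # Assumption: a cluster may always issue tokens for its own users.
        return True
    remote = master_config.get("RemoteClusters", {}).get(issuer, {})
    return bool(remote.get("AuthenticateLocalUsers", False))


# Usage, mirroring the example: aaaaa validates a token issued by bbbbb.
eeeee_config = {"RemoteClusters": {"bbbbb": {"AuthenticateLocalUsers": True}}}
accepted = accept_token("bbbbb", "eeeee-tpzed-012340123401234", eeeee_config)
```

 If the token is accepted, the caller would then refresh its cached copy of the user record from eeeee, as the last step of the example describes.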

 h2. Distributed database

 In the future, this can be extended by having clusters use a distributed database for storing users and tokens with "eeeee" prefixes. For now, each non-master cluster will just continue to update its cached user record (if stale at login or token validation time) from the master cluster, and proxy update requests to the master.

 h2. Validating tokens

 _(Revision 16 section; not present in revision 17.)_

 (...even when the issuing cluster is unreachable)

 Each cluster should be able to validate a token that was issued by a different, currently unreachable, cluster. This contrasts with the current setup, where aaaaa validates tokens issued by bbbbb by doing a callback to bbbbb.

 This seems easy enough: instead of random strings, tokens can be [like] "JSON Web Tokens":https://jwt.io/, signed by a private key whose public part is known by all clusters. (This would also be more efficient than callbacks, benefiting the mutually-untrusted cluster scenario too.)

 h2. "Master" cluster

 _(Revision 16 section; not present in revision 17.)_

 The authoritative place to store/load per-user information (preferences, and the "this email is just an alternate way to log in to a different account" marker) is:
 * ...for callers outside the "eeeee" group of clusters: the "eeeee" cluster
 * ...for callers inside the "eeeee" group of clusters, for now: a single manually designated "master" cluster (probably "eeeee")
 * ...for callers inside the "eeeee" group of clusters, in future: a group-wide distributed database whose default/initial "master" is eeeee

 Until a distributed database is implemented, each non-master cluster can update its cached user record (if stale at login or token validation time) from the master cluster, and proxy update requests to the master.

 h2. Migration 

 Admin migration tool (given an admin token for each cluster) should load user lists from all clusters in the group, and merge non-eeeee accounts into new/existing eeeee-* accounts.
 * The identity_url field no longer provides upstream auth info, so we'll need to match accounts by email address.

 _Revision 16 text (replaced by the above in revision 17):_

 Admin tool should, for each record in the user database:
 * Check whether the UUID has been generated from the email address as described above (if so, do nothing)
 * Check whether another user record exists with the generated UUID (if not, create one)
 * [Prompt and] change existing references to the old UUID to the new generated UUID (_if the existing email address field is trusted,_ this should include tokens, SSH keys, etc.)
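 The per-record migration steps (derive the UUID from the email address, create the record if it is missing, repoint references from the old UUID) can be sketched over in-memory data as below. Everything here is illustrative: the record layout, the @owner_uuid@ reference field, the function names, and the padding used in @sha1part@ are assumptions, and a real tool would work through the API with admin tokens and prompt before merging.

```python
import hashlib

BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1part(s):
    """First 15 chars of base-36 SHA-1 (left-padded to 31 digits: an assumption)."""
    n = int.from_bytes(hashlib.sha1(s.encode("utf-8")).digest(), "big")
    digits = ""
    while n:
        n, r = divmod(n, 36)
        digits = BASE36[r] + digits
    return digits.rjust(31, "0")[:15]


def migrate(users, references, prefix="eeeee"):
    """Return {old_uuid: new_uuid}; mutates `users` and `references` in place.

    users:      dict of uuid -> {"email": str or None}   (hypothetical layout)
    references: list of {"owner_uuid": uuid, ...}        (tokens, SSH keys, etc.)
    """
    remap = {}
    for old_uuid, rec in list(users.items()):
        email = rec.get("email")
        if not email:
            continue  # e.g. remote users with no trusted email: leave alone
        new_uuid = "{}-tpzed-{}".format(prefix, sha1part(email))
        if old_uuid == new_uuid:
            continue  # UUID was already generated from the email address
        users.setdefault(new_uuid, {"email": email})  # create if missing
        remap[old_uuid] = new_uuid
    for ref in references:
        ref["owner_uuid"] = remap.get(ref["owner_uuid"], ref["owner_uuid"])
    return remap
```

 The old record is kept in place in this sketch, so deciding when to retire it is left to the operator.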