Multi-cluster user database » History » Revision 6
Tom Clegg, 07/25/2019 02:56 PM
Multi-cluster user database
It is sometimes desirable to share a single user database across multiple Arvados clusters. For example:
- Clusters aaaaa, bbbbb, ccccc, ddddd, eeeee are on different continents, but they use the same upstream authentication providers (ldap/google).
- A down/unreachable cluster should not prevent any user from using other clusters in the group -- even if the down/unreachable cluster is the one where the user's account was initially created.
This requires some changes to login and token validation. (Currently, any given user account has a single "home cluster" that can issue or validate tokens for it.)
Logging in
Each user should be able to log in to their account using any cluster, regardless of where/whether they have logged in previously.
To achieve this (without depending on real-time communication between clusters), we need all of the participating clusters to agree on a mapping of upstream authentication results to Arvados user UUIDs. For example, if the upstream authentication result is "ldap://ldap.example foo@bar.example" ("ldap://ldap.example assures us this user is foo@bar.example"):
- If a row already exists in the users table with upstream == "ldap://ldap.example foo@bar.example", then use that row.
- Otherwise, create a new row with user UUID "eeeee-tpzed-${sha1part(upstream)}" (where eeeee is a common prefix used by all participating clusters and sha1part() is the first 15 chars of base-36-encoded sha1()).
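The lookup-or-create rule above can be sketched as follows. This is only an illustration: the page does not say how the SHA-1 digest is turned into base 36, so the big-endian integer interpretation, the lowercase digit set, and the zero-padding to 31 digits are assumptions, as are the hypothetical names `find_or_create_user` and the in-memory `users` dict standing in for the users table.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1part(upstream: str) -> str:
    """Return the first 15 chars of the base-36 encoding of sha1(upstream).

    Assumption: the 20-byte digest is read as a big-endian integer and
    left-padded with zeros to 31 base-36 digits (enough for 160 bits).
    """
    n = int.from_bytes(hashlib.sha1(upstream.encode("utf-8")).digest(), "big")
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(BASE36_DIGITS[r])
    encoded = "".join(reversed(digits)).rjust(31, "0")
    return encoded[:15]


def find_or_create_user(users: dict, upstream: str, prefix: str = "eeeee") -> str:
    """users maps user UUID -> upstream string (None for untrusted remote rows)."""
    for uuid, row_upstream in users.items():
        if row_upstream == upstream:
            return uuid  # an existing row wins, whatever its UUID prefix
    uuid = f"{prefix}-tpzed-{sha1part(upstream)}"
    users[uuid] = upstream
    return uuid
```

Because the new UUID depends only on the shared prefix and the upstream string, two clusters that have never communicated would assign the same UUID to the same upstream identity.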
To avoid changing existing user accounts' UUIDs to eeeee-*, we would do a one-time synchronization of user accounts (and their upstreams) across all participating clusters. For example, if aaaaa-tpzed-012340123401234 exists on cluster aaaaa, we would add that row to bbbbb and ccccc as well. Next time a user logs in to bbbbb with an upstream account matching aaaaa-tpzed-012340123401234, bbbbb would issue a token itself, rather than deferring to aaaaa.
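A rough sketch of what such a one-time synchronization might look like, using plain dicts in place of each cluster's users table. The function name `sync_users` is hypothetical, and the page does not say what should happen when two clusters already map the same upstream to different UUIDs; here such cases are only reported and left untouched, which is an assumption.

```python
def sync_users(clusters: dict) -> list:
    """Copy every row that has a non-null upstream to all clusters.

    clusters maps cluster id -> {user UUID: upstream or None}.
    Returns a list of (upstream, first_uuid, other_uuid) conflicts.
    """
    merged = {}         # upstream -> UUID agreed on so far
    conflicted = set()  # upstreams seen with more than one UUID
    conflicts = []
    for cluster_id in sorted(clusters):
        for uuid, upstream in clusters[cluster_id].items():
            if upstream is None:
                continue  # untrusted remote account: not part of the shared db
            known = merged.setdefault(upstream, uuid)
            if known != uuid:
                conflicted.add(upstream)
                conflicts.append((upstream, known, uuid))
    for users in clusters.values():
        for upstream, uuid in merged.items():
            if upstream in conflicted:
                continue  # assumption: leave conflicts for manual resolution
            if users.get(uuid) is None:
                # Row is missing, or was only cached as a remote user.
                users[uuid] = upstream
    return conflicts
```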
Untrusted remote accounts (the kind that we already have in the users table with foreign UUIDs) have a null upstream field.
| uuid | upstream | significance |
| aaaaa-tpzed-aaaaaaaaaaaaaaa | google:// foo@bar.example | Imported/migrated from remote cluster aaaaa |
| eeeee-tpzed-012340123401234 | ldap://ldap.example foo@baz.example | User didn't exist before the multi-cluster user db system arrived |
| ooooo-tpzed-ooooooooooooooo | NULL | Remote user from cluster ooooo (not part of our multi-cluster group) |
Configuration
Each cluster needs to know:
- the uuid prefix to use when creating a new account, e.g., "eeeee"
- additional user uuid prefixes that remote clusters are trusted to validate
  Clusters:
    aaaaa:
      Login:
        AssignUUIDPrefix: eeeee
      RemoteClusters:
        bbbbb:
          Proxy: true
          Authenticate:
            aaaaa: {} # accept tokens issued by bbbbb for users with uuid aaaaa-*
            bbbbb: {} # (implied)
            eeeee: {} # accept tokens issued by bbbbb for users with uuid eeeee-*
Example: aaaaa needs to validate a token issued by bbbbb.
- Do a callback to bbbbb (or check JWT signature) to confirm bbbbb really issued this token and get the relevant user UUID (result: yes, user uuid is eeeee-tpzed-012340123401234)
- If config Clusters.aaaaa.RemoteClusters.bbbbb.Authenticate.eeeee is present, accept the token
- Otherwise, fetch eeeee's config; if RemoteClusters.bbbbb.Authenticate.eeeee is present, accept the token
- Otherwise, reject the token
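The acceptance steps above could be expressed roughly as below, once the callback (or signature check) has already confirmed the issuer and the user UUID. The function `accepts_token` and the `fetch_config` callback are hypothetical; how one cluster would fetch another cluster's config is not specified on this page, so the sketch simply assumes it returns that cluster's config section as a dict, or None if unavailable.

```python
def accepts_token(local_cfg: dict, issuer: str, user_uuid: str, fetch_config) -> bool:
    """Decide whether a token confirmed to come from `issuer` is acceptable.

    local_cfg is the section under Clusters.<local id>.
    fetch_config(prefix) returns the config section of cluster `prefix`, or None.
    """
    prefix = user_uuid.split("-", 1)[0]

    def listed(cfg) -> bool:
        # True if cfg has RemoteClusters.<issuer>.Authenticate.<prefix>
        remote = (cfg or {}).get("RemoteClusters", {}).get(issuer) or {}
        return prefix in (remote.get("Authenticate") or {})

    # Implied entry: an issuer we know may always vouch for its own users.
    if prefix == issuer and issuer in local_cfg.get("RemoteClusters", {}):
        return True
    if listed(local_cfg):
        return True
    # Otherwise ask the cluster named by the UUID prefix; reject if that fails.
    return listed(fetch_config(prefix))
```

With the example configuration, aaaaa would accept a bbbbb-issued token for eeeee-tpzed-012340123401234 without consulting any other cluster, because Authenticate.eeeee is present locally.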
Validating tokens
Each cluster should be able to validate a token that was issued by a different, currently unreachable, cluster. This contrasts with the current setup, where aaaaa validates tokens issued by bbbbb by doing a callback to bbbbb.
This seems easy enough: instead of random strings, tokens can be [like] JWT, signed by a private key whose public part is known by all clusters. (This would also be more efficient than callbacks, benefiting the mutually-untrusted cluster scenario too.)
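To illustrate validation without a callback, here is a minimal JWT-style sketch using only the Python standard library. Note the substitution: the proposal calls for a private key whose public part is known to all clusters, but the standard library has no asymmetric signatures, so this sketch uses HMAC-SHA256 (the JWT "HS256" algorithm) with a shared key as a stand-in. With a shared key any cluster could forge tokens for any other, so a real deployment of the idea would presumably use an asymmetric algorithm. The claim names follow common JWT usage, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(key: bytes, issuer: str, user_uuid: str, ttl: int = 3600) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iss": issuer, "sub": user_uuid, "exp": int(time.time()) + ttl}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)


def validate_token(key: bytes, token: str):
    """Return the claims if the token verifies locally, else None."""
    try:
        head_b64, claims_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(head_b64))
        claims = json.loads(_b64url_decode(claims_b64))
        sig = _b64url_decode(sig_b64)
    except ValueError:
        return None  # wrong number of parts, bad base64, or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    signing_input = f"{head_b64}.{claims_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    if not isinstance(exp, (int, float)) or exp < time.time():
        return None
    return claims
```

No network call to the issuing cluster is involved: aaaaa can check a token issued by bbbbb even while bbbbb is unreachable, and can then apply its Authenticate configuration to the `iss` and `sub` claims.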
Updated by Tom Clegg over 5 years ago · 18 revisions