Story #11453

Federated user identity which works across a network of Arvados clusters

Added by Tom Morris about 1 year ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
06/20/2017
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
2.0

Description

Basic elements:
- a single login server which provides authentication for all clusters in the network
- a single user UUID is used across all nodes in the cluster.

API server needs two additional features:
1. Validate salted token by contacting origin cluster
2. As an origin cluster, validate a received token from a remote cluster

Validation requests return the user record which is used to populate the local user table, along with an expiration time after which revalidation should occur.

Draft: Federated identity

Migration process from local identity to network identity is separate


Subtasks

Task #11874: [Spike] Prototype federated identityResolved

Task #12424: Migration process to convert local user IDs to network cluster IDsClosed

Task #12455: Validate v2-format salted tokensResolvedTom Clegg

Task #12440: Review 11453-federated-tokensResolvedTom Clegg


Related issues

Blocks Arvados - Story #11454: Support federated search across a set of Arvados clustersResolved2017-04-11

Blocks Arvados - Story #12705: Documentation/helper scripts for migrating users to federated identityResolved2018-01-11

Associated revisions

Revision e21d1ca9
Added by Tom Clegg 5 months ago

Merge branch '11453-federated-tokens'

refs #11453

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

Revision 3a3fc5ac (diff)
Added by Tom Clegg 3 months ago

Fix wrong cluster label on search page when authenticated remotely.

Add uuid prefix to discovery doc.

refs #11453

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Clegg about 1 year ago

  • Description updated (diff)

#2 Updated by Tom Morris 8 months ago

  • Target version set to Arvados Future Sprints

#3 Updated by Tom Morris 7 months ago

  • Description updated (diff)

#4 Updated by Tom Morris 7 months ago

  • Description updated (diff)

#5 Updated by Tom Morris 7 months ago

  • Description updated (diff)
  • Story points set to 2.0

#6 Updated by Tom Morris 7 months ago

  • Assigned To set to Tom Clegg
  • Target version changed from Arvados Future Sprints to 2017-10-25 Sprint

#7 Updated by Tom Clegg 6 months ago

  • Status changed from New to In Progress

#8 Updated by Peter Amstutz 6 months ago

  • Target version changed from 2017-10-25 Sprint to 2017-11-08 Sprint

#9 Updated by Peter Amstutz 6 months ago

  • When validating a remote user, is_admin is forced to be false, but is_active is accepted from the remote site. I understand if the remote user is not active we shouldn't accept the token, however this also ignores the local configuration Rails.configuration.new_users_are_active when the user is initially created, which prevents the cluster admin from requiring manual activation of remote users.
  • It looks like once a user account is created, a user could create a new token one the remote cluster and start using that, bypassing validation with the home cluster. Is that desirable / intentional?
  • test "list readable groups with salted token" checks for the presence of "all users" and absence of "system group" but not any other group fixtures. Since these are special group ids they could conceivably have special behavior that subverts the test.
  • RemoteUserAccountTest would benefit from some comments. I think what it is doing is setting up a web server in a separate thread so that the "auth" method can call "ArvadosApiToken" which calls "ApiClientAuthorization.validate" which contacts the stub web server? And the behavior of the stub web server is predetermined, so we're only testing the client side?
  • In validate() you call uuid[0..4] in a couple of places, it might be easier to understand if it was assigned to a variable with a descriptive name like "token_uuid_prefix".

#10 Updated by Peter Amstutz 6 months ago

test 'remote api server is not an api server' should return status 200.

should bail out if 'uuid' or 'secret' is empty or nil (in case of maelformed v2/ tokens)

should test maelformed v2/ tokens.

is there a test that re-validates an expired token?

what happens if a remote user has an email address or username that conflicts with a local user?

what happens if the remote user's email address or username changes?

(currently, it looks like usernames are required to be unique, email is not).

it looks like the remote auth token caching doesn't set the "secret" field, so subsequent requests won't use it but (possibly) re-validate instead.

need a test for auth token caching.

what's the rationale for putting the "validate token from remote cluster" logic in load_read_auths of application_controller instead of in the arvados_api_token middleware?

The "validate token from remote cluster" logic should ensure that the HTTP method is GET.

#11 Updated by Peter Amstutz 6 months ago

All new tokens given to the client should be v2/ format tokens so the client can tell if it is talking to the home cluster or a remote cluster and determine if it needs to salt the token or not.

#12 Updated by Tom Clegg 6 months ago

Peter Amstutz wrote:

  • When validating a remote user, is_admin is forced to be false, but is_active is accepted from the remote site. I understand if the remote user is not active we shouldn't accept the token, however this also ignores the local configuration Rails.configuration.new_users_are_active when the user is initially created, which prevents the cluster admin from requiring manual activation of remote users.

Right. As it stands, if you accept users from remote site X, then you rely on site X's activation policy. Since the default config accepts users from *.arvadosapi.com, this seems rather permissive.

(todo?) set is_active=remote_is_active if (!remote_is_active or local_auto_activate); otherwise, leave it alone so local admin can activate

(todo) consider other activation stuff, like adding to "all users" group

  • It looks like once a user account is created, a user could create a new token one the remote cluster and start using that, bypassing validation with the home cluster. Is that desirable / intentional?

This does sound dubious in that it could effectively bypass the home cluster's token expiry time.

One such procedure (login with google at remote's SSO → create session at remote's API based on matching identity_url) can be fixed (in user_sessions_controller#create) by matching existing users on identity_url only if uuid[0..4] matches Rails.configuration.uuid_prefix.

For UX (avoiding accidentally creating a local account instead of using a remote account), for now we're assuming nobody will point their web browser at a "remote" api server.

Do you have other "create a new token" procedures in mind?

  • test "list readable groups with salted token" checks for the presence of "all users" and absence of "system group" but not any other group fixtures. Since these are special group ids they could conceivably have special behavior that subverts the test.

(todo) add checks for other fixtures

  • RemoteUserAccountTest would benefit from some comments. I think what it is doing is setting up a web server in a separate thread so that the "auth" method can call "ArvadosApiToken" which calls "ApiClientAuthorization.validate" which contacts the stub web server? And the behavior of the stub web server is predetermined, so we're only testing the client side?

(todo) add comments here

Yes. So far we just have separate tests for the server side and the client side. An integration test will require multiple uuid-prefixes = multiple apiserver instances = multiple databases = ... some more work.

  • In validate() you call uuid[0..4] in a couple of places, it might be easier to understand if it was assigned to a variable with a descriptive name like "token_uuid_prefix".

(todo)

#13 Updated by Tom Clegg 6 months ago

  • Target version changed from 2017-11-08 Sprint to 2017-11-22 Sprint

#14 Updated by Tom Clegg 5 months ago

  • Target version changed from 2017-11-22 Sprint to 2017-12-06 Sprint

#15 Updated by Peter Amstutz 5 months ago

Related to #12627 it might be a good idea to move more of the token handling into the middleware instead of the application controller (at least from an code auditing standpoint it makes it easier to understand the request handler process).

#16 Updated by Peter Amstutz 5 months ago

  • Blocks Story #12705: Documentation/helper scripts for migrating users to federated identity added

#18 Updated by Tom Clegg 5 months ago

11453-federated-tokens @ f3dc89653597f7f6de480850231ea1f6b991c8aa

changes:
  • activation
    • if remote user is active, and local mirror entry already exists and is active, user is active locally
    • if remote user is active, and local config is auto-activate, then remote user is activated immediately
    • if remote user is inactive, remote user is deactivated if necessary
  • identity_url is not copied from remote (so remote user can't bypass remote auth -- or get a "trusted client" token -- by logging in via local SSO)
  • email and username are not copied from remote (so they don't conflict with local)
  • salted token is cached locally so it can be validated on subsequent calls without repeating the remote verification
  • "verify salted token" checks http method == GET
  • token checks moved to middleware
  • test remote-auth cache + expiry + un-expiry, malformed tokens, ignore extra token fields
  • other cleanup/comments noted above

All new tokens given to the client should be v2/ format tokens so the client can tell if it is talking to the home cluster or a remote cluster and determine if it needs to salt the token or not.

I'm not sure I follow this. I see how this could be a convenience for a client (just store a list of tokens, instead of a mapping of cluster→token). Is that what you mean, or are you saying it's necessary?

#19 Updated by Tom Clegg 5 months ago

  • Target version changed from 2017-12-06 Sprint to 2017-12-20 Sprint

#20 Updated by Tom Clegg 5 months ago

11453-federated-tokens @ ae36f78143f258c4eecbee623efb2c2bfcd303a8

#21 Updated by Peter Amstutz 5 months ago

Updates look good. I did some manual testing between two arvbox instances.

Since they use self-signed certificates, I added this (you should consider applying it):

--- a/services/api/app/models/api_client_authorization.rb
+++ b/services/api/app/models/api_client_authorization.rb
@@ -133,6 +133,9 @@ class ApiClientAuthorization < ArvadosModel
       # [re]validate it.
       begin
         clnt = HTTPClient.new
+        if Rails.configuration.sso_insecure
+          clnt.ssl_config.verify_mode = OpenSSL::SSL::VERIFY_NONE
+        end
         remote_user = SafeJSON.load(
           clnt.get_content('https://' + host + '/arvados/v1/users/current',
                            {'remote' => Rails.configuration.uuid_prefix},

I created a token in Python:

>>> import hmac
>>> import hashlib
>>> h = hmac.new('4yxuv7ra2ge7pndh3a075nwa2nd8endbm1kf7v73dyt0yiws2v', '1bq65', hashlib.sha1)
>>> "v2/1lzl6-gj3su-evhdy1tn20jjb0d/"+h.hexdigest()
'v2/1lzl6-gj3su-evhdy1tn20jjb0d/3586b7802b2a37abafd056a019ba5307636a31b9'

This worked! I was able to use "arv user current".

I was also able to log into workbench by adding "?api_token=..."

Notes:

  • Your note has a mistake, in your branch you are copying the email address. However, I don't think Arvados cares if there are two accounts with the same email addresses. We can probably update email address when refreshing the user record.
  • We are not assigning a username at all. This repositories. I suggest using the upstream username if available, or creating a unique username if not. We probably don't want to ever change the username once it has been assigned.
  • Should we also synchronize the user's public SSH keys from upstream?

All new tokens given to the client should be v2/ format tokens so the client can tell if it is talking to the home cluster or a remote cluster and determine if it needs to salt the token or not.

I'm not sure I follow this. I see how this could be a convenience for a client (just store a list of tokens, instead of a mapping of cluster→token). Is that what you mean, or are you saying it's necessary?

I can't remember now if I had a specific case in mind where it was strictly necessary. There's some benefit to making the tokens more self-describing, but I don't think it blocks anything.

#22 Updated by Peter Amstutz 5 months ago

Note on #12758 (Ensure federated identity + group sync tool work well together)

11453-federated-tokens does not include synchronizing federated group memberships, although it has some minimal support for queries necessary to support it in the future.

This means the group sync tool will still need to be run on every cluster, but will update group membership for federated user accounts instead of local ones.

#23 Updated by Tom Clegg 5 months ago

11453-federated-tokens @ 1b993cdda270016bcf82fcad7f2168659345aa0e
  • sso_insecure applies to remote auth too
  • unique username is assigned using the usual rules (based on remote username if available, otherwise based on email)
  • included "email is preserved" in test case

Copying SSH keys sounds like a good idea. After merging this, right?

#24 Updated by Peter Amstutz 5 months ago

Tom Clegg wrote:

11453-federated-tokens @ 1b993cdda270016bcf82fcad7f2168659345aa0e
  • sso_insecure applies to remote auth too

Thanks.

  • unique username is assigned using the usual rules (based on remote username if available, otherwise based on email)

Looks like it doesn't check if we already assigned a username? Which might lead to it assigning a new username on every token refresh!

Copying SSH keys sounds like a good idea. After merging this, right?

Yes, was just a thought, not a blocker.

#25 Updated by Tom Clegg 5 months ago

Peter Amstutz wrote:

Looks like it doesn't check if we already assigned a username? Which might lead to it assigning a new username on every token refresh!

The find_or_create_by() block only runs in the "create" case. I'll add a test though.

#26 Updated by Tom Clegg 5 months ago

#27 Updated by Tom Clegg 5 months ago

#29 Updated by Tom Clegg 4 months ago

  • Target version changed from 2017-12-20 Sprint to 2018-01-17 Sprint

#30 Updated by Tom Morris 4 months ago

  • Status changed from In Progress to Closed

Also available in: Atom PDF