Bug #20831
closedUser unsetup method does not consistently remove permissions in a federation
Added by Brett Smith over 1 year ago. Updated about 1 year ago.
Description
In a federation with a login cluster, if you call a user's unsetup method on the login cluster, they will still have permission links on other clusters in the federation. This is undesirable for all the same reasons we started removing them originally in #19501. Permission links should be removed across the federation.
Updated by Brett Smith over 1 year ago
- Related to Bug #19501: unsetup should remove all sharing permissions for the deactivated user added
Updated by Peter Amstutz about 1 year ago
- Target version changed from Future to Development 2023-11-29 sprint
Updated by Peter Amstutz about 1 year ago
As part of the user account synchronization ApiClientAuthorization.validate
, it will unsetup the user:
if user.is_invited && !remote_user['is_invited'] # Remote user is not "invited" state, they should be unsetup, which # also makes them inactive. user.unsetup else
However, the UsersController.batch_update
method doesn't do this.
I think what we want is a new method like User.handle_remote_record
is used by both the token validation and batch_update so that the logic for things like user activation/deactivation and usernames is consistent.
Updated by Peter Amstutz about 1 year ago
- Assigned To changed from Lucas Di Pentima to Peter Amstutz
Updated by Peter Amstutz about 1 year ago
20831-user-table-locks @ 282562ff358c549980a48ccca41944039f86483a
- All agreed upon points are implemented / addressed.
- Behavior for updating user records that belong to remote clusters now uses the same code path for both token
validate
andbatch_update
- Behavior for updating user records that belong to remote clusters now uses the same code path for both token
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- n/a
- Code is tested and passing, both automated and manual, what manual testing was done is described
- passes existing tests, which are pretty thorough
- Documentation has been updated.
- n/a
- Behaves appropriately at the intended scale (describe intended scale).
- now takes a row lock on the user record before updating it. this shouldn't affect scaling but it does mean updates that would have raced (believed to be the cause of #21059) should now be serialized.
- Considered backwards and forwards compatibility issues between client and server.
- when the remote cluster indicates an account has been deactivated, it should now be correctly deactivated on the satellite cluster, which was always the intended behavior but does represent a minor behavior change.
- Follows our coding standards and GUI style guidelines.
- yes
This change is pretty straightforward, the logic for batch update and user update from token validate has been combined into a single method. It also now takes a row lock on the user record, which is intended to solve multiple-activation email bug #21059.
Updated by Peter Amstutz about 1 year ago
- Status changed from New to In Progress
Updated by Peter Amstutz about 1 year ago
20831-user-table-locks @ e42bf8fbe66f822066e13c08b346005a52e1aa4a
- All agreed upon points are implemented / addressed.
- Behavior for updating user records that belong to remote clusters now uses the same code path for both token
validate
andbatch_update
- Behavior for updating user records that belong to remote clusters now uses the same code path for both token
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- n/a
- Code is tested and passing, both automated and manual, what manual testing was done is described
- passes existing tests, including controller tests, which are pretty thorough
- Documentation has been updated.
- n/a
- Behaves appropriately at the intended scale (describe intended scale).
- now takes a row lock on the user record before updating it. this shouldn't affect scaling but it does mean updates that would have raced (believed to be the cause of #21059) should now be serialized.
- Considered backwards and forwards compatibility issues between client and server.
- when the remote cluster indicates an account has been deactivated, it should now be correctly deactivated on the satellite cluster, which was always the intended behavior but does represent a minor behavior change.
- it turns out there was an issue, the is_admin and is_invited fields were not always returned; the Go SDK was turning them into "false" in this case which caused it to incorrectly deactivate the user. The User.IsAdmin and User.IsInvited fields are now pointers. The code using those fields (only a handful of places) has been updated.
- Since it handles situations where is_admin and is_invited are present or not present, it should be backwards compatible. However, to get the new user deactivation behavior, you need to update the Login cluster so that it starts sending those fields in situations where it currently isn't.
- Follows our coding standards and GUI style guidelines.
- yes
This change is pretty straightforward, the logic for batch update and user update from token validate has been combined into a single method. It also now takes a row lock on the user record, which is intended to solve multiple-activation email bug #21059.
Updated by Tom Clegg about 1 year ago
As discussed offline, I still think changing IsAdmin to a pointer (implicitly using nil to mean "field not populated") is not appropriate. We've been eliminating the ad-hoc mix of pointers and values by sticking with / changing to values. Changing that strategy to edit-warring individual fields as needed seems like the worst side to land on, especially since this is an SDK, purportedly usable by code other than arvados itself. Sync-groups etc. (see failing tests) aren't the only things that will break.
But given that the problem here is that the correct way to interpret a false value depends on the remote cluster's API version, I think the right solution is probably to check the remote cluster's API version using the discovery doc.
Updated by Peter Amstutz about 1 year ago
Tom Clegg wrote in #note-11:
As discussed offline, I still think changing IsAdmin to a pointer (implicitly using nil to mean "field not populated") is not appropriate. We've been eliminating the ad-hoc mix of pointers and values by sticking with / changing to values. Changing that strategy to edit-warring individual fields as needed seems like the worst side to land on, especially since this is an SDK, purportedly usable by code other than arvados itself. Sync-groups etc. (see failing tests) aren't the only things that will break.
But given that the problem here is that the correct way to interpret a false value depends on the remote cluster's API version, I think the right solution is probably to check the remote cluster's API version using the discovery doc.
It's more complicated than that, because sometimes the is_admin and is_invited fields are returned, and when that does happen, it is better if we don't ignore them (because the whole point of this ticket was to correctly "unsetup" users when we find out they have been disabled upstream).
So now it (a) it checks the discovery document and assumes valid is_admin and is_invited for newer API server revisions and (b) for backwards compatibility, it checks for the conditions where we expect valid is_admin and is_invited to be returned.
My main reservation about this solution is that if there is a case I didn't account for where it expects the fields to be returned but they are not sent, users will be accidentally deactivated because the logic is based on expected behavior and not ground truth. I guess that risk is acceptable to avoid introducing pointers in the User struct.
20831-user-table-locks @ 135bada0fe08de2b678ede684d43a155c4351ed3
Updated by Lucas Di Pentima about 1 year ago
There are failing tests, so I'm assigning the review task back to you so I don't get notified until it's really ready.
Updated by Peter Amstutz about 1 year ago
0b0a1bec011d8fb19205f53d78848fee800f2d88
Passing tests, now checks the discover doc revision and current user to decide whether to treat is_admin and is_invited fields as valid or not, as discussed in note 11.
Updated by Lucas Di Pentima about 1 year ago
Just a few comments:
arvados/lib/controller/federation/conn.go
- L708: Should that
fmt.Printf()
call be passed to the logging facility instead? - L728: Shouldn't
err
be checked too?
- L708: Should that
- Do you think
User.update_remote_user()
would benefit of having tests written for it?
Updated by Peter Amstutz about 1 year ago
Lucas Di Pentima wrote in #note-16:
Just a few comments:
arvados/lib/controller/federation/conn.go
- L708: Should that
fmt.Printf()
call be passed to the logging facility instead?
That was a debug printf that was left in by mistake, I removed it.
- L728: Shouldn't
err
be checked too?
Yes, added an error check.
- Do you think
User.update_remote_user()
would benefit of having tests written for it?
api/test/integration/remote_user_test.rb
and functional/arvados/v1/users_controller_test.rb
indirectly invoke User.update_remote_user()
with a variety of remote user cases, I don't know what additional tests I would write that are not covered by the existing tests.
Updated by Tom Clegg about 1 year ago
- in rpc.Conn, we might not want to cache the discovery doc forever -- maybe refresh after 1h?
- rpc.Conn should be goroutine-safe, so discoveryDocument access should be protected with a mutex
- in federation,
conn.chooseBackend(conn.cluster.ClusterID)
should be justconn.local
(both in the new code and in the VocabularyGet method above which it was probably copied from)
Updated by Peter Amstutz about 1 year ago
Tom Clegg wrote in #note-18:
Not that anyone asked me, but
- in rpc.Conn, we might not want to cache the discovery doc forever -- maybe refresh after 1h?
- rpc.Conn should be goroutine-safe, so discoveryDocument access should be protected with a mutex
- in federation,
conn.chooseBackend(conn.cluster.ClusterID)
should be justconn.local
(both in the new code and in the VocabularyGet method above which it was probably copied from)
Updated by Peter Amstutz about 1 year ago
Updated by Peter Amstutz about 1 year ago
- Status changed from In Progress to Resolved