arv-federation-migrate script issues
We've received report of this script having problems with the following:
- Dry run mode fails because of undefined variable(s)
- New users are created without first/last names, that makes it difficult to share with
- Feature request: Could the migrated collections/projects NOT be placed on a special "Migrated from..." subproject?
#1 Updated by Lucas Di Pentima 3 months ago
Updates at bfdc7fac1 - branch
Test run: https://ci.arvados.org/job/arv-federation-migrate-test/80/
- Fixes dry-run mode.
- Adds command-line argument to avoid migrating data to a special subproject.
- Makes new user creation take the first & last names of the account being migrated.
- Adds test checks for first+last names migration.
#3 Updated by Peter Amstutz 2 months ago
A couple thoughts.
Per feedback the message "Not migrating %s because user is admin but target user %s is not admin on %s" should probably also include the instructions "please ensure the user admin status is the same on both clusters"
I'm thinking instead of a flag "--avoid-subproject-creation" that should just move things to the user as the default. If there is a name collision we should make sure the error clearly says what happened and explains what the admin needs to do about it. We could try to detect name collisions through the API but it would be kind of slow. I don't really want to write that code unless we have to.
#4 Updated by Peter Amstutz 2 months ago
Here's the error you get if you try to merge a users and there's a collision reassigning ownership:
Error: request failed: http://localhost:8004/arvados/v1/users/merge: 422 Unprocessable Entity: #<ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "index_collections_on_owner_uuid_and_name" DETAIL: Key (owner_uuid, name)=(x20vs-tpzed-hhh7bidyjo43cm6, Collection ABC) already exists. : UPDATE "collections" SET "owner_uuid" = 'x20vs-tpzed-hhh7bidyjo43cm6' WHERE "collections"."owner_uuid" = $1> (req-16loxhmfybacdpz8tlph)
#5 Updated by Peter Amstutz 2 months ago
How about doing a regex match on the error message, if we can match on something like:
Key \(owner_uuid, name\)=\((:uuid pattern:), (.*)\) already exists. UPDATE "(.*)"
Then we can provide a more useful error message saying "both source and target have a $2 named $3, please rename them".
Although, that will be annoying if there's more than one collision.
Alternately, if there's a collision, it could rename one of them automatically. Or fallback on creating a group.
If we assume this is unlikely to happen, then the least effort solution would be to simply provide a useful error if it does happen.
#6 Updated by Lucas Di Pentima 2 months ago
From chat: We will make the migrate script to attempt migrating data to the target's home project and if there's an error, do a regex-match to see if the problem is about name collisions. If it is, give a meaningful error message to the admin and provide a command line flag to make the migrate script to fallback to migrate into a subproject.
#7 Updated by Lucas Di Pentima 2 months ago
Updates at 5147b5515
Test run: https://ci.arvados.org/job/arv-federation-migrate-test/84/
- Changes the default behaviour to migrate user's data directly to the target account's home project instead of a new subproject.
- Provides a command line flag
--data-into-subprojectthat allows the admin to fallback to the previous behaviour if needed.
- Detects naming collision errors and explains why the user account can't be migrated, offering the previously mentioned flag as a way of solving the problem.
- Expands the explanation message when source & target accounts differ in their admin status, noting that a federated admin user has admin powers on the entire federation.
#8 Updated by Peter Amstutz 2 months ago
Let's tweak this message slightly:
(%s) Target owner %s already has a %s named '%s', skipping. Please rename it or use --data-into-subproject to migrate all users' data into a special subproject." % (email, target_owner, rsc_type[:-1], rsc_name))
(%s) Cannot migrate to %s because both origin and target users have a %s named '%s'. Please rename the conflicting items or use --data-into-subproject to migrate all users' data into a special subproject." % (email, target_owner, rsc_type[:-1], rsc_name))
#9 Updated by Lucas Di Pentima 2 months ago
Updates at 81c2dc65f
- Updates error message as requested on note-8.
Manual test proof:
(venv-arvados) lucas@buster:~/arvados$ arv-federation-migrate --migrate users.csv Checking that the federation is well connected Getting user list from x1stu Getting user list from x2b3l Getting user list from x3vnm (email@example.com) Migrating x3vnm-tpzed-cxi6su8x5yw3lc3 to x2b3l-tpzed-7kw5oe4pum34b74 on x3vnm (firstname.lastname@example.org) Cannot migrate to x2b3l-tpzed-7kw5oe4pum34b74 because both origin and target users have a group named 'New project'. Please rename the conflicting items or use --data-into-subproject to migrate all users' data into a special subproject.