Bug #17150
closedSystemRootToken with non-alphanumeric characters silently fails
Description
SystemRootToken accepts only alphanumeric characters or it fails to communicate with controller. Tests show that:
- "changeme_system_root_token_changeme_system": fails
- "change" passes
- "badtoken00badtoken00badtoken00badtoken00b" passes
- changemeasystemarootatokenachangemeasystechangemeasystemarootatokenachangemeasyste" passes
Original report, for reference:
pvanheus in gitter mentioned having issues with a fresh install of Arvados using the formula. His original issue (missing crunch-dispatch-local
's config file) was solved adding the file, but evidenced an issue using the SystemRootToken
.
I could reproduce the issue in a freshly installed Arvados cluster:
Relevant section of Arvados' config.yml
Clusters: arva2: SystemRootToken: "changeme_system_root_token" ManagementToken: "changeme_management_token"
Fails¶
Using these parameters:
export ARVADOS_API_HOST=arva2.arv.local:443 export ARVADOS_API_HOST_INSECURE=true export ARVADOS_API_TOKEN=changeme_system_root_token
I can't query Arvados
root@arva2:~# curl -q\ --insecure \ -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \ https://$ARVADOS_API_HOST/arvados/v1/collections
I get
{"errors":["Not logged in (req-zdh06tyffy1r1qmipp33)"],"error_token":"1605793145+18c3488a"}
In
/var/www/arvados-api/current/log/production.log
{"method":"GET","path":"/arvados/v1/collections","format":"*/*","controller":"Arvados::V1::CollectionsController","action":"index","status":401,"duration":1.18,"view":0.22,"db":0.0,"request_id":"req-zhstydjocopp1ddt7nj7","client_ipaddr":"127.0.0.1","client_auth":null,"params":{},"@timestamp":"2020-11-19T13:41:48.493440858Z","@version":"1","message":"[401] GET /arvados/v1/collections (Arvados::V1::CollectionsController#index)"}
and in
journalctl -f
Nov 19 13:41:48 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-zhstydjocopp1ddt7nj7","level":"info","msg":"request","remoteAddr":"127.0.0.1:33138","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","time":"2020-11-19T13:41:48.487492882Z"} Nov 19 13:41:48 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-zhstydjocopp1ddt7nj7","level":"info","msg":"response","remoteAddr":"127.0.0.1:33138","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","respBody":"{\"errors\":[\"Not logged in (req-zhstydjocopp1ddt7nj7)\"],\"error_token\":\"1605793308+2a226ffd\"}","respBytes":91,"respStatus":"Unauthorized","respStatusCode":401,"time":"2020-11-19T13:41:48.494689236Z","timeToStatus":0.006851,"timeTotal":0.007199,"timeWriteBody":0.000348}
Passes¶
If I change the credentials to the ones I get in Workbench's UI, the query succeeeds:
export ARVADOS_API_TOKEN=v2/arva2-gj3su-sxldaz37f8h28un/szqnavhypc5wit9k3xrjhlen3ewj504w3f4heb3qc50c6qh6w root@arva2:~# curl -q\ --insecure \ -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \ https://$ARVADOS_API_HOST/arvados/v1/collections
I get
{"kind":"arvados#collectionList","etag":"","self_link":"","offset":0,"limit":100,"items":[{"href":"/collections/arva2-4zz18-8cykjg5gyxfiv4f","kind":"arvados#collection","etag":"pspmj2vb9voo69q6qy87k7wt","uuid":"arva2-4zz18-8cykjg5gyxfiv4f","owner_uuid":"arva2-tpzed-000000000000000","created_at":"2020-11-19T10:59:40.534588000Z","modified_by_client_uuid":null,"modified_by_user_uuid":"arva2-tpzed-000000000000000","modified_at":"2020-11-19T10:59:40.534969000Z","name":"empty collection","description":null,"properties":{},"portable_data_hash":"d41d8cd98f00b204e9800998ecf8427e+0","replication_desired":null,"replication_confirmed":null,"replication_confirmed_at":null,"storage_classes_desired":["default"],"storage_classes_confirmed":[],"storage_classes_confirmed_at":null,"delete_at":null,"trash_at":null,"is_trashed":false,"version":1,"current_version_uuid":"arva2-4zz18-8cykjg5gyxfiv4f","preserve_version":false,"file_count":0,"file_size_total":0}],"items_available":1}
In
/var/www/arvados-api/current/log/production.log
{"method":"GET","path":"/arvados/v1/collections","format":"*/*","controller":"Arvados::V1::CollectionsController","action":"index","status":200,"duration":9.31,"view":0.27,"db":3.5,"request_id":"req-1opkrw7usorjm1cj50q6","client_ipaddr":"127.0.0.1","client_auth":"arva2-gj3su-sxldaz37f8h28un","params":{},"@timestamp":"2020-11-19T13:45:05.088265297Z","@version":"1","message":"[200] GET /arvados/v1/collections (Arvados::V1::CollectionsController#index)"}
and in
journalctl -f
Nov 19 13:45:05 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-1opkrw7usorjm1cj50q6","level":"info","msg":"request","remoteAddr":"127.0.0.1:33144","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","time":"2020-11-19T13:45:05.068032762Z"} Nov 19 13:45:05 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-1opkrw7usorjm1cj50q6","level":"info","msg":"response","remoteAddr":"127.0.0.1:33144","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","respBytes":972,"respStatus":"OK","respStatusCode":200,"time":"2020-11-19T13:45:05.089663139Z","timeToStatus":0.021186,"timeTotal":0.021624,"timeWriteBody":0.000438}
Updated by Javier Bértoli about 4 years ago
Lucas suggested testing directly on the Rails API. The failure and pass situation are exactly the same:
# export ARVADOS_API_TOKEN=changeme_system_root_token # curl -q --insecure \ -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \ http://127.0.0.2:8004/arvados/v1/collections {"errors":["Not logged in (req-68wahompepcj7rqe551b)"],"error_token":"1605795377+19e9c858"} # export ARVADOS_API_TOKEN=v2/arva2-gj3su-sxldaz37f8h28un/szqnavhypc5wit9k3xrjhlen3ewj504w3f4heb3qc50c6qh6w # curl -q --insecure \ -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \ http://127.0.0.2:8004/arvados/v1/collections {"kind":"arvados#collectionList","etag":"","self_link":"","offset":0,"limit":100,"items":[{"href":"/collections/arva2-4zz18-8cykjg5gyxfiv4f","kind":"arvados#collection","etag":"pspmj2vb9voo69q6qy87k7wt","uuid":"arva2-4zz18-8cykjg5gyxfiv4f","owner_uuid":"arva2-tpzed-000000000000000","created_at":"2020-11-19T10:59:40.534588000Z","modified_by_client_uuid":null,"modified_by_user_uuid":"arva2-tpzed-000000000000000","modified_at":"2020-11-19T10:59:40.534969000Z","name":"empty collection","description":null,"properties":{},"portable_data_hash":"d41d8cd98f00b204e9800998ecf8427e+0","replication_desired":null,"replication_confirmed":null,"replication_confirmed_at":null,"storage_classes_desired":["default"],"storage_classes_confirmed":[],"storage_classes_confirmed_at":null,"delete_at":null,"trash_at":null,"is_trashed":false,"version":1,"current_version_uuid":"arva2-4zz18-8cykjg5gyxfiv4f","preserve_version":false,"file_count":0,"file_size_total":0}],"items_available":1}
Updated by Javier Bértoli about 4 years ago
- Subject changed from crunch-dispatch-local get's connection refused using SystemRootToken (and probably any tool using it) to SystemRootToken with non-alphanumeric characters silently fails
Nico suggested changing the length of the SystemRootToken to "badtoken00badtoken00badtoken00badtoken00b":
1. Edited the config.yml
file,
2. Restarted the services (systemctl restart arvados-controller
, systemctl restart nginx
)
3. exported the new TOKEN
export ARVADOS_API_TOKEN=badtoken00badtoken00badtoken00badtoken00b
And it passes OK in all the tests above.
To make other tests:
1. Changed the token to "changeme_system_root_token_changeme_syste" and it failed again.
2. Changed the token to "changemeasystemarootatokenachangemeasystechangemeasystemarootatokenachangemeasyste" and it passed.
3. Changed the token to "change" and it passed.
So it seems the issue is with non-alpha chars.
Updated by Lucas Di Pentima about 4 years ago
For the record:
The problem is that the (now old) auth header validation regexp being used on railsAPI was /(OAuth2|Bearer) ([-\/a-zA-Z0-9]+)/
, that doesn't include the underscore character, and from that match, the 2nd group was being used to get the token. This means that the token got truncated from the first underscore onwards.
Current code from #16669 already fixes that, but the story isn't currently fully resolved yet so it wasn't included on the latest release.
Updated by Javier Bértoli about 4 years ago
- % Done changed from 0 to 100
- Target version set to 2020-12-02 Sprint
- Assigned To set to Lucas Di Pentima
- Status changed from New to Feedback
- Category set to Tests
- added @pvanheus fix
- added a debug parameter to the provision script
- added a test script to verify the cluster is able to run a CWL workflow (not enabled by default, can be enabled in Vagrantfile or run independently)
- document salt-install's workflow test
Updated by Javier Bértoli about 4 years ago
- Related to Task #17146: [salt][provision] Modify the script to run salt-call in info mode instead of debug mode added
Updated by Lucas Di Pentima about 4 years ago
- Assigned To changed from Lucas Di Pentima to Javier Bértoli
Updated by Lucas Di Pentima about 4 years ago
- Status changed from Feedback to In Progress
Updated by Lucas Di Pentima about 4 years ago
- When trying the vagrant flavor, the cwl run-test.sh tries to create an ‘admin’ user that already exists, so it fails. I think this is because I logged in as ‘admin’ on workbench before running the cwl test script. I suppose the script could create a user with a random name?
- The Vagrant file changed the vm hostname to “vagrant.local” from “arva2.arv.local”, but instructions on the documentation haven’t changed, I’m not sure if that would produce problems but wanted to mention it just in case.
- If the CWL test files are just a copy from the some on
sdk/cwl/
, I think it would be convenient to symlink them so they get updated with the originals. - I’ve re-created the VM without attempting to use the “admin” user on workbench, and test passed OK.
Updated by Javier Bértoli about 4 years ago
Lucas Di Pentima wrote:
- When trying the vagrant flavor, the cwl run-test.sh tries to create an ‘admin’ user that already exists, so it fails. I think this is because I logged in as ‘admin’ on workbench before running the cwl test script. I suppose the script could create a user with a random name?
At first I did that, however, as Arvados is being installed with "first user that logs in is the cluster's admin" option, if you run the test script before login into the UI, the 'admin' user won't be granted admin powers anymore :D
It's kind of a chicken/egg issue. Probably I can try one of these 2 options to the script:
- add another user, make sure 'admin' is 'admin'
- add 'admin', make sure it does not fail if exists (probably the same with some of the other steps)
- The Vagrant file changed the vm hostname to “vagrant.local” from “arva2.arv.local”, but instructions on the documentation haven’t changed, I’m not sure if that would produce problems but wanted to mention it just in case.
Yes, I needed to do that because Vagrant sets some FQDNs in /etc/hosts
and using 'arva2.arv.local' as the Vagrant hostname adds it to the file with IP 127.x.x.x which confuses the docker containers resolver.
That's why I also changed all the 127.0.0.2
entries to <role>.internal
, so it's clear everywhere what's expected.
The change should not be a problem for the cluster.
- If the CWL test files are just a copy from the some on
sdk/cwl/
, I think it would be convenient to symlink them so they get updated with the originals.
Not sure, @Nico César gave them to me. I'll check and do as you suggest if that's the case.
- I’ve re-created the VM without attempting to use the “admin” user on workbench, and test passed OK.
Great. I'll merge and push the Saltstack PR and work on the changes you suggested. Thanks
Updated by Javier Bértoli about 4 years ago
- Merged the salt formula
- Fixed the test script to:
- add the user defined in the
provision.sh
script (defaults toadmin
)
- verify if the resources already exist (user, token, project), so the script can be run repeatedly without errors.
- add the user defined in the
- the CWL code is not part of the any previous CWL already available in the repo, so I left it as is.
commit e9eaf87ec@arvados, branch 17150-system-root-token
Updated by Lucas Di Pentima about 4 years ago
I tried starting a new vagrant instance and this happened:
- Error message at the end of creating a new vagrant instance: arvados:
usermod: group 'docker' does not exist
(not sure if it was happening before) - Error when running run-tests script:
Arvados project uuid is 'arva2-j7d0g-u2mosxf9jtnom17' Uploading arvados/jobs' docker image to the project Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/images/json: dial unix /var/run/docker.sock: connect: permission denied Traceback (most recent call last): File "/usr/share/python3/dist/python3-arvados-python-client/lib/python3.7/site-packages/arvados/commands/keepdocker.py", line 136, in docker_images next(list_output) # Ignore the header line StopIteration
Reloading the instance didn't fixed the problem
Updated by Javier Bértoli about 4 years ago
There was a change in a dependency (docker formula was upgraded). Fixed it, commit 11c4d6f00@arvados
Updated by Javier Bértoli about 4 years ago
- Status changed from In Progress to Resolved
Merged commit 1dcd10a37@arvados.