Bug #17150

SystemRootToken with non-alphanumeric characters silently fails

Added by Javier Bértoli 11 months ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Tests
Target version:
Start date:
11/24/2020
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Description

SystemRootToken accepts only alphanumeric characters or it fails to communicate with controller. Tests show that:

  • "changeme_system_root_token_changeme_system": fails
  • "change" passes
  • "badtoken00badtoken00badtoken00badtoken00b" passes
  • changemeasystemarootatokenachangemeasystechangemeasystemarootatokenachangemeasyste" passes

Original report, for reference:

pvanheus in gitter mentioned having issues with a fresh install of Arvados using the formula. His original issue (missing crunch-dispatch-local's config file) was solved adding the file, but evidenced an issue using the SystemRootToken.

I could reproduce the issue in a freshly installed Arvados cluster:

Relevant section of Arvados' config.yml

Clusters:
  arva2:

    SystemRootToken: "changeme_system_root_token" 
    ManagementToken: "changeme_management_token" 

Fails

Using these parameters:

export ARVADOS_API_HOST=arva2.arv.local:443
export ARVADOS_API_HOST_INSECURE=true
export ARVADOS_API_TOKEN=changeme_system_root_token

I can't query Arvados

root@arva2:~# curl -q\
   --insecure \
   -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \
  https://$ARVADOS_API_HOST/arvados/v1/collections

I get
{"errors":["Not logged in (req-zdh06tyffy1r1qmipp33)"],"error_token":"1605793145+18c3488a"}

In /var/www/arvados-api/current/log/production.log
{"method":"GET","path":"/arvados/v1/collections","format":"*/*","controller":"Arvados::V1::CollectionsController","action":"index","status":401,"duration":1.18,"view":0.22,"db":0.0,"request_id":"req-zhstydjocopp1ddt7nj7","client_ipaddr":"127.0.0.1","client_auth":null,"params":{},"@timestamp":"2020-11-19T13:41:48.493440858Z","@version":"1","message":"[401] GET /arvados/v1/collections (Arvados::V1::CollectionsController#index)"}

and in journalctl -f
Nov 19 13:41:48 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-zhstydjocopp1ddt7nj7","level":"info","msg":"request","remoteAddr":"127.0.0.1:33138","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","time":"2020-11-19T13:41:48.487492882Z"}
Nov 19 13:41:48 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-zhstydjocopp1ddt7nj7","level":"info","msg":"response","remoteAddr":"127.0.0.1:33138","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","respBody":"{\"errors\":[\"Not logged in (req-zhstydjocopp1ddt7nj7)\"],\"error_token\":\"1605793308+2a226ffd\"}","respBytes":91,"respStatus":"Unauthorized","respStatusCode":401,"time":"2020-11-19T13:41:48.494689236Z","timeToStatus":0.006851,"timeTotal":0.007199,"timeWriteBody":0.000348}

Passes

If I change the credentials to the ones I get in Workbench's UI, the query succeeeds:

export ARVADOS_API_TOKEN=v2/arva2-gj3su-sxldaz37f8h28un/szqnavhypc5wit9k3xrjhlen3ewj504w3f4heb3qc50c6qh6w

root@arva2:~# curl -q\
   --insecure \
   -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \
  https://$ARVADOS_API_HOST/arvados/v1/collections

I get
{"kind":"arvados#collectionList","etag":"","self_link":"","offset":0,"limit":100,"items":[{"href":"/collections/arva2-4zz18-8cykjg5gyxfiv4f","kind":"arvados#collection","etag":"pspmj2vb9voo69q6qy87k7wt","uuid":"arva2-4zz18-8cykjg5gyxfiv4f","owner_uuid":"arva2-tpzed-000000000000000","created_at":"2020-11-19T10:59:40.534588000Z","modified_by_client_uuid":null,"modified_by_user_uuid":"arva2-tpzed-000000000000000","modified_at":"2020-11-19T10:59:40.534969000Z","name":"empty collection","description":null,"properties":{},"portable_data_hash":"d41d8cd98f00b204e9800998ecf8427e+0","replication_desired":null,"replication_confirmed":null,"replication_confirmed_at":null,"storage_classes_desired":["default"],"storage_classes_confirmed":[],"storage_classes_confirmed_at":null,"delete_at":null,"trash_at":null,"is_trashed":false,"version":1,"current_version_uuid":"arva2-4zz18-8cykjg5gyxfiv4f","preserve_version":false,"file_count":0,"file_size_total":0}],"items_available":1}

In /var/www/arvados-api/current/log/production.log
{"method":"GET","path":"/arvados/v1/collections","format":"*/*","controller":"Arvados::V1::CollectionsController","action":"index","status":200,"duration":9.31,"view":0.27,"db":3.5,"request_id":"req-1opkrw7usorjm1cj50q6","client_ipaddr":"127.0.0.1","client_auth":"arva2-gj3su-sxldaz37f8h28un","params":{},"@timestamp":"2020-11-19T13:45:05.088265297Z","@version":"1","message":"[200] GET /arvados/v1/collections (Arvados::V1::CollectionsController#index)"}

and in journalctl -f
Nov 19 13:45:05 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-1opkrw7usorjm1cj50q6","level":"info","msg":"request","remoteAddr":"127.0.0.1:33144","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","time":"2020-11-19T13:45:05.068032762Z"}
Nov 19 13:45:05 arva2 arvados-controller[1070]: {"PID":1070,"RequestID":"req-1opkrw7usorjm1cj50q6","level":"info","msg":"response","remoteAddr":"127.0.0.1:33144","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"arva2.arv.local","reqMethod":"GET","reqPath":"arvados/v1/collections","reqQuery":"","respBytes":972,"respStatus":"OK","respStatusCode":200,"time":"2020-11-19T13:45:05.089663139Z","timeToStatus":0.021186,"timeTotal":0.021624,"timeWriteBody":0.000438}


Subtasks

Task #17164: Review 17150-system-root-tokenResolvedJavier Bértoli


Related issues

Related to Arvados - Task #17146: [salt][provision] Modify the script to run salt-call in info mode instead of debug modeResolved11/24/2020

Associated revisions

Revision d3e5a70e (diff)
Added by Peter van Heusden 11 months ago

Fix salt-install's crunch-dispatch-local config and tests

refs #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision cc9b9dac (diff)
Added by Peter van Heusden 11 months ago

Fix salt-install's crunch-dispatch-local config and tests

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision e9eaf87e (diff)
Added by Javier Bértoli 11 months ago

Fix salt-install and test scripts

  • add checks to find existing resources
  • run some shellcheck

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision 76928023 (diff)
Added by Javier Bértoli 11 months ago

fix(docker): formula upgraded

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision 11c4d6f0 (diff)
Added by Javier Bértoli 11 months ago

fix(test): wrong variable comparison

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision 1dcd10a3
Added by Javier Bértoli 11 months ago

Merge branch '17150-system-root-token'

closes #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision e916ddcd (diff)
Added by Peter van Heusden 10 months ago

Fix salt-install's crunch-dispatch-local config and tests

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision 33ef9263 (diff)
Added by Javier Bértoli 10 months ago

Fix salt-install and test scripts

  • add checks to find existing resources
  • run some shellcheck

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision e4062664 (diff)
Added by Javier Bértoli 10 months ago

fix(docker): formula upgraded

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

Revision 3b6a6446 (diff)
Added by Javier Bértoli 10 months ago

fix(test): wrong variable comparison

refs #17146, #17147, #17150

Arvados-DCO-1.1-Signed-off-by: Javier Bértoli <>

History

#1 Updated by Javier Bértoli 11 months ago

  • Description updated (diff)

#2 Updated by Javier Bértoli 11 months ago

Lucas suggested testing directly on the Rails API. The failure and pass situation are exactly the same:

# export ARVADOS_API_TOKEN=changeme_system_root_token

# curl -q --insecure \
  -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \
  http://127.0.0.2:8004/arvados/v1/collections

{"errors":["Not logged in (req-68wahompepcj7rqe551b)"],"error_token":"1605795377+19e9c858"}

# export ARVADOS_API_TOKEN=v2/arva2-gj3su-sxldaz37f8h28un/szqnavhypc5wit9k3xrjhlen3ewj504w3f4heb3qc50c6qh6w

# curl -q --insecure \
  -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \
  http://127.0.0.2:8004/arvados/v1/collections

{"kind":"arvados#collectionList","etag":"","self_link":"","offset":0,"limit":100,"items":[{"href":"/collections/arva2-4zz18-8cykjg5gyxfiv4f","kind":"arvados#collection","etag":"pspmj2vb9voo69q6qy87k7wt","uuid":"arva2-4zz18-8cykjg5gyxfiv4f","owner_uuid":"arva2-tpzed-000000000000000","created_at":"2020-11-19T10:59:40.534588000Z","modified_by_client_uuid":null,"modified_by_user_uuid":"arva2-tpzed-000000000000000","modified_at":"2020-11-19T10:59:40.534969000Z","name":"empty collection","description":null,"properties":{},"portable_data_hash":"d41d8cd98f00b204e9800998ecf8427e+0","replication_desired":null,"replication_confirmed":null,"replication_confirmed_at":null,"storage_classes_desired":["default"],"storage_classes_confirmed":[],"storage_classes_confirmed_at":null,"delete_at":null,"trash_at":null,"is_trashed":false,"version":1,"current_version_uuid":"arva2-4zz18-8cykjg5gyxfiv4f","preserve_version":false,"file_count":0,"file_size_total":0}],"items_available":1}

#3 Updated by Javier Bértoli 11 months ago

  • Subject changed from crunch-dispatch-local get's connection refused using SystemRootToken (and probably any tool using it) to SystemRootToken with non-alphanumeric characters silently fails

Nico suggested changing the length of the SystemRootToken to "badtoken00badtoken00badtoken00badtoken00b":

1. Edited the config.yml file,
2. Restarted the services (systemctl restart arvados-controller, systemctl restart nginx)
3. exported the new TOKEN

export ARVADOS_API_TOKEN=badtoken00badtoken00badtoken00badtoken00b

And it passes OK in all the tests above.

To make other tests:

1. Changed the token to "changeme_system_root_token_changeme_syste" and it failed again.
2. Changed the token to "changemeasystemarootatokenachangemeasystechangemeasystemarootatokenachangemeasyste" and it passed.
3. Changed the token to "change" and it passed.

So it seems the issue is with non-alpha chars.

#4 Updated by Javier Bértoli 11 months ago

  • Description updated (diff)

#5 Updated by Lucas Di Pentima 11 months ago

For the record:

The problem is that the (now old) auth header validation regexp being used on railsAPI was /(OAuth2|Bearer) ([-\/a-zA-Z0-9]+)/, that doesn't include the underscore character, and from that match, the 2nd group was being used to get the token. This means that the token got truncated from the first underscore onwards.

Current code from #16669 already fixes that, but the story isn't currently fully resolved yet so it wasn't included on the latest release.

#6 Updated by Javier Bértoli 11 months ago

  • % Done changed from 0 to 100
  • Target version set to 2020-12-02 Sprint
  • Assigned To set to Lucas Di Pentima
  • Status changed from New to Feedback
  • Category set to Tests

#7 Updated by Javier Bértoli 11 months ago

  • Related to Task #17146: [salt][provision] Modify the script to run salt-call in info mode instead of debug mode added

#8 Updated by Lucas Di Pentima 11 months ago

  • Assigned To changed from Lucas Di Pentima to Javier Bértoli

#9 Updated by Lucas Di Pentima 11 months ago

  • Status changed from Feedback to In Progress

#10 Updated by Lucas Di Pentima 11 months ago

  • When trying the vagrant flavor, the cwl run-test.sh tries to create an ‘admin’ user that already exists, so it fails. I think this is because I logged in as ‘admin’ on workbench before running the cwl test script. I suppose the script could create a user with a random name?
  • The Vagrant file changed the vm hostname to “vagrant.local” from “arva2.arv.local”, but instructions on the documentation haven’t changed, I’m not sure if that would produce problems but wanted to mention it just in case.
  • If the CWL test files are just a copy from the some on sdk/cwl/, I think it would be convenient to symlink them so they get updated with the originals.
  • I’ve re-created the VM without attempting to use the “admin” user on workbench, and test passed OK.

#11 Updated by Javier Bértoli 11 months ago

Lucas Di Pentima wrote:

  • When trying the vagrant flavor, the cwl run-test.sh tries to create an ‘admin’ user that already exists, so it fails. I think this is because I logged in as ‘admin’ on workbench before running the cwl test script. I suppose the script could create a user with a random name?

At first I did that, however, as Arvados is being installed with "first user that logs in is the cluster's admin" option, if you run the test script before login into the UI, the 'admin' user won't be granted admin powers anymore :D

It's kind of a chicken/egg issue. Probably I can try one of these 2 options to the script:

  • add another user, make sure 'admin' is 'admin'
  • add 'admin', make sure it does not fail if exists (probably the same with some of the other steps)
  • The Vagrant file changed the vm hostname to “vagrant.local” from “arva2.arv.local”, but instructions on the documentation haven’t changed, I’m not sure if that would produce problems but wanted to mention it just in case.

Yes, I needed to do that because Vagrant sets some FQDNs in /etc/hosts and using 'arva2.arv.local' as the Vagrant hostname adds it to the file with IP 127.x.x.x which confuses the docker containers resolver.

That's why I also changed all the 127.0.0.2 entries to <role>.internal, so it's clear everywhere what's expected.

The change should not be a problem for the cluster.

  • If the CWL test files are just a copy from the some on sdk/cwl/, I think it would be convenient to symlink them so they get updated with the originals.

Not sure, @Nico gave them to me. I'll check and do as you suggest if that's the case.

  • I’ve re-created the VM without attempting to use the “admin” user on workbench, and test passed OK.

Great. I'll merge and push the Saltstack PR and work on the changes you suggested. Thanks

#12 Updated by Javier Bértoli 11 months ago

  • Merged the salt formula
  • Fixed the test script to:
    • add the user defined in the provision.sh script (defaults to admin)
    • verify if the resources already exist (user, token, project), so the script can be run repeatedly without errors.
  • the CWL code is not part of the any previous CWL already available in the repo, so I left it as is.

commit e9eaf87ec@arvados, branch 17150-system-root-token

#13 Updated by Lucas Di Pentima 11 months ago

I tried starting a new vagrant instance and this happened:

  • Error message at the end of creating a new vagrant instance: arvados: usermod: group 'docker' does not exist (not sure if it was happening before)
  • Error when running run-tests script:
Arvados project uuid is 'arva2-j7d0g-u2mosxf9jtnom17'
Uploading arvados/jobs' docker image to the project
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/images/json: dial unix /var/run/docker.sock: connect: permission denied
Traceback (most recent call last):
  File "/usr/share/python3/dist/python3-arvados-python-client/lib/python3.7/site-packages/arvados/commands/keepdocker.py", line 136, in docker_images
    next(list_output)  # Ignore the header line
StopIteration

Reloading the instance didn't fixed the problem

#14 Updated by Javier Bértoli 11 months ago

There was a change in a dependency (docker formula was upgraded). Fixed it, commit 11c4d6f00@arvados

#15 Updated by Lucas Di Pentima 11 months ago

LGTM, thanks!

#16 Updated by Javier Bértoli 11 months ago

  • Status changed from In Progress to Resolved

Merged commit 1dcd10a37@arvados.

#17 Updated by Peter Amstutz 5 months ago

  • Release set to 38

Also available in: Atom PDF