Feature #14716

[keep-web] Use cluster config file

Added by Peter Amstutz 10 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
08/08/2019
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Subtasks

Task #15391: Review 14716-webdav-cluster-configResolvedLucas Di Pentima

Task #15543: Review 14716-doc-fixesResolvedLucas Di Pentima


Related issues

Related to Arvados - Story #13648: [Epic] Use one cluster configuration file for all componentsNew

Associated revisions

Revision 1ea75125
Added by Tom Clegg 4 months ago

Merge branch '14716-anonymous-token'

refs #14716

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

Revision 0fb04af5
Added by Lucas Di Pentima 3 months ago

Merge branch '14716-webdav-cluster-config'
Refs #14716

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>

Revision b6499675
Added by Lucas Di Pentima 3 months ago

Merge branch '14716-doc-fixes'
Closes #14716

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>

History

#1 Updated by Peter Amstutz 10 months ago

  • Related to Story #13648: [Epic] Use one cluster configuration file for all components added

#2 Updated by Tom Morris 10 months ago

  • Target version changed from To Be Groomed to Arvados Future Sprints
  • Story points set to 1.0

#3 Updated by Tom Morris 5 months ago

  • Target version changed from Arvados Future Sprints to 2019-07-03 Sprint

#4 Updated by Lucas Di Pentima 5 months ago

  • Assigned To set to Lucas Di Pentima

#5 Updated by Lucas Di Pentima 5 months ago

  • Status changed from New to In Progress

#6 Updated by Lucas Di Pentima 5 months ago

  • Target version changed from 2019-07-03 Sprint to 2019-07-17 Sprint

#7 Updated by Lucas Di Pentima 4 months ago

  • Target version changed from 2019-07-17 Sprint to 2019-07-31 Sprint

#8 Updated by Lucas Di Pentima 4 months ago

  • Target version changed from 2019-07-31 Sprint to 2019-08-14 Sprint

#9 Updated by Lucas Di Pentima 4 months ago

Updates at 2c7b49e87 - branch 14716-webdav-cluster-config
Test run: https://ci.curoverse.com/job/developer-run-tests/1448/

There's a keep-web integration test that's failing and I'm not being able to tell why. It's the one testing access to a collection using /collections/uuid/path. Supposedly the anonymous token is used on that case ignoring all other provided credentials, I made sure that the anonymous user token is configured, however but it still fails.
Some Workbench integration tests also fail, and they're related to the download feature so maybe I'm seeing the same issue from 2 different points of view.

#10 Updated by Tom Clegg 4 months ago

It looks like the test was relying on a bug. The "/collections/$uuid/path" form is only supposed to use the configured anonymous tokens, and ignore caller-supplied tokens. But when no anonymous tokens were configured (as in this test), keep-web was incorrectly using the supplied credentials.

14716-anonymous-token @ f89544af7f3d38bd61b4216527d66897eb08dcd0 -- https://ci.curoverse.com/view/Developer/job/developer-run-tests/1451/
  • don't use supplied credentials at /collections/$uuid/path, even if no anon tokens are configured
  • fix test → expect 404

#11 Updated by Lucas Di Pentima 3 months ago

Updates at a33badcdc - branch 14716-webdav-cluster-config
Test run: https://ci.curoverse.com/job/developer-run-tests/1459/

  • Merged Tom's fix so now keep-web tests pass OK
  • Updated arvbox to make keep-web work
  • Migrated deprecated NodeProfiles on arvbox cluster config

There're some workbench integration tests still failing. I'm not being able to run them correctly on my local dev VM (even on master, they fail in other ways) so not sure how to debug this. I think what's happening is a configuration issue between keep-web and workbench of some kind. Already tried the new keep-web on arvbox and seem to work fine.

#12 Updated by Lucas Di Pentima 3 months ago

The workbench integration tests that are failing are for example (from https://ci.curoverse.com/job/developer-run-tests-apps-workbench-integration/1533/console):

13:55:05  56) Failure:
13:55:05 DownloadTest#test_download_anonymous_content_from_keep-web_by_portable_data_hash [/tmp/workspace/developer-run-tests-apps-workbench-integration/apps/workbench/test/integration/download_test.rb:84]:
13:55:05 --- expected
13:55:05 +++ actual
13:55:05 @@ -1,2 +1 @@
13:55:05 -"Hello world
13:55:05 -" 
13:55:05 +nil
13:55:05 
13:55:05 
13:55:05  57) Failure:
13:55:05 DownloadTest#test_download_anonymous_content_from_keep-web_by_uuid [/tmp/workspace/developer-run-tests-apps-workbench-integration/apps/workbench/test/integration/download_test.rb:84]:
13:55:05 --- expected
13:55:05 +++ actual
13:55:05 @@ -1,2 +1 @@
13:55:05 -"Hello world
13:55:05 -" 
13:55:05 +nil
13:55:05 
13:55:05 
13:55:05  58) Failure:
13:55:05 DownloadTest#test_download_from_keep-web_using_a_session_token [/tmp/workspace/developer-run-tests-apps-workbench-integration/apps/workbench/test/integration/download_test.rb:84]:
13:55:05 Expected: "w a z" 
13:55:05   Actual: nil

Checking the keep-web.log from the artifacts (https://ci.curoverse.com/job/developer-run-tests-apps-workbench-integration/1533/artifact/tmp/keep-web.log) I'm seeing that all requests that mention the 'Hello World.txt' and 'w a z' files were served correctly, that's why I'm thinking there has to be a problem in the interaction between keep-web and workbench of some sort.

#13 Updated by Tom Clegg 3 months ago

config docs

Set Services.Controller.Insecure: true if your API server's TLS certificate is not signed by a recognized CA.

should be TLS.Insecure: true

"http://:9002/": {}

should be "http://keep_web_hostname_goes_here:9002/": {}

exec sudo -u nobody keep-web -config=/path/to/arvados.yml

I wonder if we need a note that arvados.yml needs to be world-readable?

I wonder if we should drop -config=/path/to/arvados.yml from the install docs and just rely on the default, since we already instructed the reader to use the default location.

to your Workbench cluster configuration file

remove Workbench

ExternalURL: "https://download.<span class="userinput">uuid_prefix</span>.your.domain/c=%{uuid_or_pdh}"

This shouldn't have the %{uuid_or_pdh} stuff any more -- Workbench now adds the /c= stuff (or puts the UUID in the host part) depending on whether the URL has a wildcard.

ExternalURL: "https://%{uuid_or_pdh}--collections.<span class="userinput">uuid_prefix</span>.your.domain"

Likewise these should be "https://*--collections.uuid_prefix.your.domain" etc.

code

AnonymousUserToken:

Looks like you added a second copy of this key to config.default.yml?

(ldr *Loader) LoadDefaults()

I think I already rejected this in #14717#note-10. You can do this instead:

cfg, err := config.NewLoader(bytes.NewBufferString("Clusters: {zzzzz: {}}"), nil).Load()

If you want a helper function for that, it should be in arvadostest -- cluster zzzzz is only for testing.

yaml "gopkg.in/yaml.v2"

should use ghodss/yaml like everywhere else

DefaultConfig()

this seems to have changed purpose -- rename to newConfig(*arvados.Config) and update comment?

TestLegacyConfig

This is in services/keep-web, but seems to be testing code in lib/config -- maybe it should move to lib/config?

#14 Updated by Tom Clegg 3 months ago

this seems to fix the workbench tests: 050ea7fdc6317a0fa0eeed20b0e6cb0b7fd6693b

#15 Updated by Lucas Di Pentima 3 months ago

Now that we have set up an unified anonymous token for the workbench integration tests, I'm getting this failure:

14:44:28   9) Failure:
14:44:28 JobsTest#test_view_job_with_components_as_job_reader2_user [/tmp/workspace/developer-run-tests-apps-workbench-integration/apps/workbench/test/integration/jobs_test.rb:138]:
14:44:28 Expected false to be truthy.

I think this is related to https://github.com/curoverse/arvados/blob/master/apps/workbench/app/controllers/jobs_controller.rb#L6-L9 , which makes workbench ignore the provided credentials and use the anonymous token, making the test fail because it's seeing some public project job. We're hours away to be removing the jobs API but I'm not sure if this test will get removed. I think this is a bug that may be fixed by adding a check to the skip_around_action that no credentials are provided in addition to having an anonymous token set?

#16 Updated by Lucas Di Pentima 3 months ago

Updates at 8b873a9b3
Test run: https://ci.curoverse.com/job/developer-run-tests/1474/

My previous comments on note-15 were incorrect, they were fixed by setting the anonymous token to an empty string (initially tried that but other random test failures made me think that wasn't the issue).

Addressed suggestions from note-13 and merge latest master.

#17 Updated by Lucas Di Pentima 3 months ago

Re-ran tests using the pre-master merge commit c8f57c52224362d7621f1271774b0f2d60c55cac
https://ci.curoverse.com/job/developer-run-tests/1475/ & https://ci.curoverse.com/job/developer-run-tests-apps-workbench-integration/1556/ (wb retry)

I'm investigating why go/sdk/keepclient tests fail... locally they don't.

#18 Updated by Lucas Di Pentima 3 months ago

Seems to be a false alarm, successful re-run at https://ci.curoverse.com/job/developer-run-tests-remainder/1537/

#19 Updated by Tom Clegg 3 months ago

Should remove AssertPathExists from keep-web.service (thanks to Eric for pointing this out on another cluster-config issue)

On install-keep-web.html, when confirming that the keep-web binary is installed -- especially now that keep-web's flags are just generic config-loading options -- showing the entire output of "keep-web -h" seems like a lot of noise. Maybe just "Usage of keep-web: [...]" would be enough.

The "set the cluster config file like the following" section seems misplaced between "install runit" and "create a run script for runit". It also seems unnecessary to divide the config updates into two phases. Maybe the last section should be renamed from "Tell Workbench about..." to "Configure keep-web", and should also include the config.yml changes that are currently in the "install" section.

Related pre-existing bug: The "install keep-web" page doesn't mention that the package installs a systemd service and you should skip the runit stuff if you have systemd. It should probably be updated to match keep-balance and dispatch-cloud ("option 1: systemd" / "option 2: runit").

#20 Updated by Lucas Di Pentima 3 months ago

Doc fixes at d83e0937b - branch 14716-doc-fixes

Following the above suggestions:

  • Removed generic config loading options from the "keep-web -h" output example
  • Moved the config file description to the last section
  • Renamed last section to "configure keep-web" & adjusted it a little to be more consistent
  • Added systemd section
  • Reordered sections to be: 1) install, 2) configure and finally 3) run.

#21 Updated by Tom Clegg 3 months ago

Given that the example "generate anonymous token" transcript produces "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz", the AnonymousUserToken in the immediately following example file should also be "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz".

There are no instructions for making a SystemRootToken and I don't think keep-web even uses it -- that line should probably be removed from the example config.

The "config file should be world-readable" note is misplaced. Come to think of it, since systemd runs keep-web as root, maybe we should remove the "sudo -u nobody" from the runit script to match systemd, and remove the "world-readable" note.

The rest LGTM.

#22 Updated by Lucas Di Pentima 3 months ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF