Project

General

Profile

Actions

Idea #11167

closed

[Workbench] Remove arv-get file download fallback

Added by Peter Amstutz about 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
07/24/2017
Due date:
Story points:
1.0

Description

Currently, if keep-web isn't configured, Workbench will fallback to arv-get and Rails streaming to serve files. However, this is a very bad fallback:

  1. It fails silently if calling arv-get doesn't work
  2. It ties up a workbench worker for the duration of the download
  3. It doesn't report content-length, so user agents are unable to render a progress bar or determine if the entire file was transferred.
  4. It sometimes silently drops out in the middle of downloads
  5. It sometimes consumes huge amount of RAM, crashing the workbench server.
  6. It can't handle [some?] range requests

Instead we should:
- Workbench should refuse start if keep-web is not configured <- see note#22
- Documentation should be updated to emphasize that keep-web is mandatory, Workbench config for keep-web, and the new Workbench startup failure mode
- Remove the arv-get fallback code and adjust any related tests
- For file downloads, prefer to link directly to keep-web instead of redirecting through workbench (especially useful for sharing links) (done in #8784)


Subtasks 1 (0 open1 closed)

Task #11937: Review 11167-wb-remove-arvgetResolvedLucas Di Pentima07/24/2017Actions

Related issues

Related to Arvados - Idea #8784: [Workbench] Use keep-web to generate directory listingsResolvedTom Clegg03/23/2016Actions
Related to Arvados - Bug #17528: [Rails] package prints a backtrace on installation when the arvados config.yml file is not present/completeResolvedTom Clegg04/15/2021Actions
Actions #1

Updated by Peter Amstutz about 7 years ago

  • Description updated (diff)
Actions #2

Updated by Tom Clegg about 7 years ago

  • Description updated (diff)
Actions #3

Updated by Tom Morris about 7 years ago

  • Description updated (diff)
  • Target version set to 2017-03-15 sprint
  • Story points set to 1.0
Actions #4

Updated by Tom Morris about 7 years ago

  • Target version changed from 2017-03-15 sprint to Arvados Future Sprints
Actions #5

Updated by Peter Amstutz almost 7 years ago

  • Description updated (diff)
Actions #6

Updated by Tom Morris almost 7 years ago

  • Assigned To set to Lucas Di Pentima
  • Target version changed from Arvados Future Sprints to 2017-07-19 sprint
Actions #7

Updated by Tom Clegg almost 7 years ago

  • Description updated (diff)
Actions #8

Updated by Lucas Di Pentima almost 7 years ago

  • Status changed from New to In Progress
Actions #9

Updated by Lucas Di Pentima almost 7 years ago

  • Target version changed from 2017-07-19 sprint to 2017-08-02 sprint
Actions #10

Updated by Lucas Di Pentima almost 7 years ago

Updates at cd698336f - branch 11167-wb-remove-arvget
Test run: https://ci.curoverse.com/job/developer-run-tests/391/

  • Removed arv-get related code
  • Adjusted/removed tests
  • Added code to refuse start when keep-web isn't configured

Pending: update documentation

Actions #11

Updated by Lucas Di Pentima almost 7 years ago

Some new updates at c7187ec72
Test run: https://ci.curoverse.com/job/developer-run-tests/392/

  • Removed outdated comments from the code
  • Corrected a test that wasn't failing when runnning them without configuring keep-web.
Actions #12

Updated by Lucas Di Pentima almost 7 years ago

Fixed a failing integration test at 8f3070e16
Test run: https://ci.curoverse.com/job/developer-run-tests/393/

Actions #13

Updated by Radhika Chippada almost 7 years ago

I didn't spend much time looking at the updates, but here are a couple initial comments.

  • collections_controller -> show file method: Since workbench won’t start if keep-web is not enabled, I don’t think you need this check anymore: “if Rails.configuration.keep_web_url or Rails.configuration.keep_web_download_url”. I think you can remove this if statement and outdent the code block here?
  • please revisit the integration tests that are removed and amend them to work with keep-web instead (unless a keep-web counterpart test already exists in that context). The test 'view log via keep-web redirect' in jobs_test.rb might come handy in this direction.

Thanks.

Actions #14

Updated by Lucas Di Pentima over 6 years ago

Updates at 2918ed81152a1fe8607f433e36dceddc1cb78f74
Test run: https://ci.curoverse.com/job/developer-run-tests/399/

Addressed almost all suggestions made on the previous comment. I'm still working to adjust jobs' integration test view partial job log.

Actions #15

Updated by Radhika Chippada over 6 years ago

Lucas, a couple more comments in the meantime:

  • application.default.yml => can you add “.com” to the test config as follows?
-  keep_web_url: http://example/c=%{uuid_or_pdh}
+  keep_web_url: http://example.com/c=%{uuid_or_pdh}
  • Would it be possible to refactor the following code (similar code in anonymous_access_test and jobs_test as well) into a reusable method?
+    use_keep_web_config
+
+    token = api_fixture('api_client_authorizations')['active']['api_token']
+    data = "foo\nfile\n" 
+    datablock = `echo -n #{data.shellescape} | ARVADOS_API_TOKEN=#{token.shellescape} arv-put --no-progress --raw -`.strip
+    assert $?.success?, $?
+
+    col = nil
+    use_token 'active' do
+      mtxt = ". #{datablock} 0:#{data.length}:foo\n" 
+      col = Collection.create(manifest_text: mtxt)
+    end
Actions #16

Updated by Lucas Di Pentima over 6 years ago

  • Target version changed from 2017-08-02 sprint to 2017-08-16 sprint
Actions #17

Updated by Lucas Di Pentima over 6 years ago

Updates at 57647945b08a74bab02fed6adf947eb9adb8321f
Test run: https://ci.curoverse.com/job/developer-run-tests/407/

  • Addressed comments from Radhika regarding keep_web_url setting and integration test reusable method
  • Did a lot of testing on why "view partial job log" integration test fails, and discovered a bug on keep-web in the process, fixed.
  • The keep-web fix isn't enough because there's another bug on our testing environment, seems that the javascript code on the failing integration test doesn't get all the http response headers from keep-web, there are 2 suspects:
    • One is nginx because it's being used as a keep-web proxy when running tests.
    • Another possibility is the headless browser being used for integration tests the one losing some headers.
  • The partial log view feature was tested on arvbox and worked correctly when trying it with the keep-web fix.

Next action will be follow Tom Clegg's suggestion and try using curl from within the integration test to see if the missing headers appear, because I've been trying different nginx configurations without any success.

Actions #18

Updated by Lucas Di Pentima over 6 years ago

Updates at 02a0ac50b
Test run: https://ci.curoverse.com/job/developer-run-tests/408/

Using curl from the integration tests proved that nginx wasn't being eating those expected response headers.
Finally found the cause of the failing integration test: PhantomJS 1.9.8 seems to have a bug, as reported here: https://github.com/ariya/phantomjs/issues/13200
Changing the test to use selenium fixed the issue.

Actions #19

Updated by Tom Clegg over 6 years ago

IIRC this change means "include ActionController::Live" isn't needed any more in CollectionsController.

LGTM, thanks!

Actions #20

Updated by Lucas Di Pentima over 6 years ago

I've tried removing it (3224524c1a2d6790322448565cb5a16ea0bcb874) and some tests started to fail: https://ci.curoverse.com/job/developer-run-tests/410/

Reverted that change at 36b4355956d23f3b7402fba0e18318fc70e31e27 and started a new test run (https://ci.curoverse.com/job/developer-run-tests/412/) to confirm all is good with the latest changes before merging to master.

Actions #21

Updated by Lucas Di Pentima over 6 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:d0a4adabafa4d132ab2333338c941acc57ca82fb.

Actions #22

Updated by Nico César over 6 years ago

  • Description updated (diff)

"workbench should refuse start if keep-web is not configured" <- that's a bad idea. we have all sort of restarts and different configurations that the simple "should refuse to start" will make a nightmare in operations, is better a big-fat-red-sign on the workbench that the workbench not starting at all.

Actions #23

Updated by Ward Vandewege about 3 years ago

  • Related to Bug #17528: [Rails] package prints a backtrace on installation when the arvados config.yml file is not present/complete added
Actions

Also available in: Atom PDF