Idea #11167
closed
[Workbench] Remove arv-get file download fallback
Added by Peter Amstutz almost 8 years ago.
Updated over 7 years ago.
Description
Currently, if keep-web isn't configured, Workbench will fallback to arv-get and Rails streaming to serve files. However, this is a very bad fallback:
- It fails silently if calling arv-get doesn't work
- It ties up a workbench worker for the duration of the download
- It doesn't report content-length, so user agents are unable to render a progress bar or determine if the entire file was transferred.
- It sometimes silently drops out in the middle of downloads
- It sometimes consumes huge amount of RAM, crashing the workbench server.
- It can't handle [some?] range requests
Instead we should:
- Workbench should refuse start if keep-web is not configured <- see note#22
- Documentation should be updated to emphasize that keep-web is mandatory, Workbench config for keep-web, and the new Workbench startup failure mode
- Remove the arv-get fallback code and adjust any related tests
- For file downloads, prefer to link directly to keep-web instead of redirecting through workbench (especially useful for sharing links) (done in #8784)
- Description updated (diff)
- Description updated (diff)
- Description updated (diff)
- Target version set to 2017-03-15 sprint
- Story points set to 1.0
- Target version changed from 2017-03-15 sprint to Arvados Future Sprints
- Description updated (diff)
- Assigned To set to Lucas Di Pentima
- Target version changed from Arvados Future Sprints to 2017-07-19 sprint
- Description updated (diff)
- Status changed from New to In Progress
- Target version changed from 2017-07-19 sprint to 2017-08-02 sprint
I didn't spend much time looking at the updates, but here are a couple initial comments.
- collections_controller -> show file method: Since workbench won’t start if keep-web is not enabled, I don’t think you need this check anymore: “if Rails.configuration.keep_web_url or Rails.configuration.keep_web_download_url”. I think you can remove this if statement and outdent the code block here?
- please revisit the integration tests that are removed and amend them to work with keep-web instead (unless a keep-web counterpart test already exists in that context). The test 'view log via keep-web redirect' in jobs_test.rb might come handy in this direction.
Thanks.
Lucas, a couple more comments in the meantime:
- application.default.yml => can you add “.com” to the test config as follows?
- keep_web_url: http://example/c=%{uuid_or_pdh}
+ keep_web_url: http://example.com/c=%{uuid_or_pdh}
- Would it be possible to refactor the following code (similar code in anonymous_access_test and jobs_test as well) into a reusable method?
+ use_keep_web_config
+
+ token = api_fixture('api_client_authorizations')['active']['api_token']
+ data = "foo\nfile\n"
+ datablock = `echo -n #{data.shellescape} | ARVADOS_API_TOKEN=#{token.shellescape} arv-put --no-progress --raw -`.strip
+ assert $?.success?, $?
+
+ col = nil
+ use_token 'active' do
+ mtxt = ". #{datablock} 0:#{data.length}:foo\n"
+ col = Collection.create(manifest_text: mtxt)
+ end
- Target version changed from 2017-08-02 sprint to 2017-08-16 sprint
Updates at 57647945b08a74bab02fed6adf947eb9adb8321f
Test run: https://ci.curoverse.com/job/developer-run-tests/407/
- Addressed comments from Radhika regarding
keep_web_url
setting and integration test reusable method
- Did a lot of testing on why "
view partial job log
" integration test fails, and discovered a bug on keep-web
in the process, fixed.
- The
keep-web
fix isn't enough because there's another bug on our testing environment, seems that the javascript code on the failing integration test doesn't get all the http response headers from keep-web
, there are 2 suspects:
- One is
nginx
because it's being used as a keep-web
proxy when running tests.
- Another possibility is the headless browser being used for integration tests the one losing some headers.
- The partial log view feature was tested on
arvbox
and worked correctly when trying it with the keep-web
fix.
Next action will be follow Tom Clegg's suggestion and try using curl
from within the integration test to see if the missing headers appear, because I've been trying different nginx
configurations without any success.
IIRC this change means "include ActionController::Live" isn't needed any more in CollectionsController.
LGTM, thanks!
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:d0a4adabafa4d132ab2333338c941acc57ca82fb.
- Description updated (diff)
"workbench should refuse start if keep-web is not configured" <- that's a bad idea. we have all sort of restarts and different configurations that the simple "should refuse to start" will make a nightmare in operations, is better a big-fat-red-sign on the workbench that the workbench not starting at all.
- Related to Bug #17528: [Rails] package prints a backtrace on installation when the arvados config.yml file is not present/complete added
Also available in: Atom
PDF