Keep-web flow

(Note: this page is an alternate explanation of the authoritative docs at https://godoc.org/git.curoverse.com/arvados.git/services/keep-web#hdr-Authorization_mechanisms)

Keep-web serves files from Keep collections as normal HTTP documents.

Considerations:

We are serving arbitrary files, which can include HTML files with JavaScript. We don't want to serve these files as regular documents from Workbench, because this would open a cross-site scripting (XSS) hole: the HTML page would be loaded and its scripts executed with the credentials of the viewing user.

This is mitigated in two ways:

  1. By serving files with "Content-Disposition: attachment", which tells the browser to open the download dialog straight away rather than try to display the file.
  2. By using a separate domain for downloading, so the browser won't send Workbench cookies.
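
As a rough illustration of the first mitigation, the sketch below builds the response header that asks the browser to save a file instead of rendering it. The function name download_headers is hypothetical and this is not keep-web's actual code; it only shows what such a header could look like, using the standard filename* encoding so that unusual file names survive.

```python
from urllib.parse import quote


def download_headers(filename):
    """Build headers asking the browser to download rather than display.

    Hypothetical helper for illustration only, not taken from keep-web.
    """
    # Percent-encode the name so spaces and non-ASCII characters are safe
    # inside the filename* parameter of Content-Disposition.
    encoded = quote(filename, safe="")
    return {
        "Content-Disposition": f"attachment; filename*=UTF-8''{encoded}",
    }
```

With such a header, an uploaded HTML file is offered as a download, so its scripts never run in the viewer's browser session.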

This raises the challenge: how to provide the API token to keep-web to enable download?

Keep-web accepts an API token in the following ways:

  • With Authorization: OAuth2 header
  • With Authorization: Basic header
  • With ?api_token=xxx query string
  • With a cookie called arvados_api_token
  • With /t=xxx/ at the start of the path
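
The five mechanisms above can be sketched as a single lookup function. This is a minimal illustration, not keep-web's implementation: the function name find_api_token, the order in which the mechanisms are tried, the use of the Basic password field as the token, and reading the cookie value verbatim are all assumptions made for the example. The authoritative docs linked at the top define the real behaviour.

```python
import base64
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def find_api_token(url, headers):
    """Return (token, mechanism) for the first mechanism that yields a token.

    `url` is the request target; `headers` is a dict with lowercase names.
    The lookup order here is an illustrative assumption.
    """
    auth = headers.get("authorization", "")
    scheme, _, credentials = auth.partition(" ")
    if scheme == "OAuth2" and credentials:
        return credentials.strip(), "oauth2-header"
    if scheme == "Basic" and credentials:
        # Assumption: the password half of "user:password" carries the token.
        decoded = base64.b64decode(credentials).decode("utf-8")
        _, _, password = decoded.partition(":")
        if password:
            return password, "basic-header"

    parts = urlsplit(url)
    query_tokens = parse_qs(parts.query).get("api_token")
    if query_tokens:
        return query_tokens[0], "query-string"

    # Assumption: the cookie value is used as-is, with no extra decoding.
    cookie = SimpleCookie(headers.get("cookie", ""))
    if "arvados_api_token" in cookie:
        return cookie["arvados_api_token"].value, "cookie"

    # "/t=xxx/..." : the token is the first path segment.
    first_segment = parts.path.lstrip("/").split("/", 1)[0]
    if first_segment.startswith("t=") and len(first_segment) > 2:
        return first_segment[2:], "path"

    return None, None
```

For example, find_api_token("/t=abc123/foo/bar.txt", {}) picks the token out of the path, while a request carrying only an "Authorization: OAuth2 abc123" header is resolved from the header.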

Constraints:
  1. When doing a GET request, the API token must be either part of the request URI or in a header (the browser does not send the Workbench cookie when keep-web is on a different domain).
  2. We want to hide the API token from the user unless it is a "sharing" link.
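
To make the "sharing" link case concrete, here is a small sketch that embeds the token in the request URI using the /t=xxx/ path form listed earlier. The function name sharing_link, the host name in the usage note, and the assumption that the base URL already identifies the collection are all hypothetical; the sketch only illustrates why such a link necessarily exposes the token to whoever holds it.

```python
from urllib.parse import quote


def sharing_link(base_url, token, file_path):
    """Build a link whose path starts with /t=<token>/ (illustrative only).

    Assumption: `base_url` already points at the collection being shared.
    """
    # Encode the token fully so characters such as "/" cannot be mistaken
    # for path separators; keep "/" inside the file path itself.
    safe_token = quote(token, safe="")
    safe_path = quote(file_path.lstrip("/"))
    return f"{base_url.rstrip('/')}/t={safe_token}/{safe_path}"
```

Calling sharing_link("https://download.example.com", "abc123", "foo/bar.txt") yields a URL that works in any browser without cookies or headers, which satisfies the first constraint but, by design, does not hide the token.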