[Workbench] Sharing link for public collection still renders after sharing token is revoked, but then downloaded files are 0 bytes
To replicate: Go to https://workbench.qr1hi.arvadosapi.com/collections/download/qr1hi-4zz18-7zk4muy5grnaqpv/4qji0cfumh25dttlwteo6rj2b83z2b8vz1l0rja3uzo82bf3s/, right-click any file, and download it. The downloaded file is 0 bytes.
Scenario: This was linked from https://github.com/nouyang-curoverse/GA4GH_regions and caused confusion as noted by @adamnovak here: https://github.com/nouyang-curoverse/GA4GH_regions/pull/2
Collection (A) in Arvados at
Sharing button is still "depressed".
Collection (A) was copied to a public project (B) so that I could share the collection in "anonymous" view. I probably clicked share/unshare a few times on the collection (A).
Workaround, in the meantime:
Public project (B): https://workbench.qr1hi.arvadosapi.com/collections/qr1hi-4zz18-7zk4muy5grnaqpv
Collection sharing link from project (B): https://workbench.qr1hi.arvadosapi.com/collections/download/qr1hi-4zz18-7zk4muy5grnaqpv/2sf3n5y9nuqhi815xioxa2ychioyqts7i5ysxg5we8fdl988az/
5915: Workbench tries the anonymous reader token first for collection wget listing.
This is necessary to make sure we provide a usable token to arv-get.
If we don't check the anonymous reader token first, we might decide
that another token is usable when in actuality, the reader token is
the one that worked. Closes #5915.
#1 Updated by Abram Connelly over 6 years ago
This collection is also referenced on the wiki at https://github.com/ga4gh/schemas/wiki/Human-Genome-Variation-Map-%28HGVM%29-Pilot-Project#pilot-test-data.
#4 Updated by Brett Smith over 6 years ago
- Subject changed from files in public collection (via "share link") are 0 bytes to [Workbench] Sharing link for public collection still renders after sharing token is revoked, but then downloaded files are 0 bytes
- Category set to Workbench
- Status changed from New to In Progress
- Assigned To set to Brett Smith
- Target version changed from Bug Triage to 2015-05-20 sprint
I haven't proven it, but I'm pretty sure this is what happened:
- The first sharing link was created.
- The collection was publicly shared so people without Arvados accounts could see the full Workbench interface.
- The token from the first sharing link was revoked. (This is the part I haven't confirmed, since activity around API tokens is scarcely logged.)
Now I'm looking at CollectionsController#show_file in Workbench. It calls Collection.find with different API tokens to look for one that it can pass along to arv-get. However, we implemented public views by passing along the defined public token as a reader token with every API request. Because of this, when the collection is shared publicly, Collection.find will always return a result. This will trick show_file into thinking that any defined token is usable, even if that's not actually the case. It will use that token when it calls arv-get, and that will fail.
The easy way to get the downloads working again would be to reverse the order in which show_file searches tokens for usability. This would check the public reader token for usability first: if that fails, we would be assured that it was not skewing the results in future checks. I'm going to make that change now.
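To make the ordering problem concrete, here is a minimal sketch of the token search as described above. It is a simplified Python model, not the actual Workbench code: make_find, pick_token, and the token strings are all hypothetical stand-ins, and the only behaviour assumed is that a lookup succeeds when either the primary token or an attached reader token grants access.

```python
# Simplified model of the token search; every name here is hypothetical.
PUBLIC_READER_TOKEN = "anonymous-reader-token"


def make_find(valid_tokens):
    """Build a stand-in for Collection.find.

    The lookup succeeds if the primary token OR any reader token sent
    along with the request is allowed to read the collection.
    """
    def find(primary_token, reader_tokens=()):
        if primary_token in valid_tokens:
            return True
        return any(token in valid_tokens for token in reader_tokens)
    return find


def pick_token(find, candidates):
    """Return the first candidate for which the lookup succeeds.

    The public reader token rides along on every request, which is what
    can make a revoked candidate look usable.
    """
    for token in candidates:
        if find(token, reader_tokens=[PUBLIC_READER_TOKEN]):
            return token
    return None


# A publicly shared collection: only the public reader token can read it.
find_public = make_find({PUBLIC_READER_TOKEN})
revoked = "revoked-sharing-token"

original_order = [revoked, PUBLIC_READER_TOKEN]
reversed_order = list(reversed(original_order))

# Original order: the revoked token is wrongly judged usable.
print(pick_token(find_public, original_order))
# Reversed order: the public reader token is checked, and chosen, first.
print(pick_token(find_public, reversed_order))
```

With the original order the sketch picks the revoked token, which is the one that would then fail when handed to arv-get. With the reversed order it picks the public reader token. If the collection is not public, the public reader token fails its own check first, so it cannot mask the result for the tokens checked after it.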
This may raise an interesting question of whether Workbench should always fail when you make a request involving a sharing token that's been revoked. On the one hand, it might upset users to see a request still work after they revoked the link. On the other hand, it's not actually hurting anything from a strict security analysis: the collection is public, so the visitor can still get the data; they'd just have to try the token-less URL. I'm going to leave this question for another ticket.