Bug #2937

Collections download page trips up wget on subdirectories

Added by Brett Smith about 6 years ago. Updated about 6 years ago.

Assigned To:
Brett Smith

Last sprint, we made a collections download page that's meant to be wget-friendly: it requires no authorization, and it doesn't link to any resources other than the files in the Collection. This is #2764.

Unfortunately, we just discovered that the page is not friendly when the collection contains files in directories. wget is not automatically creating destination subdirectories, but it does attempt to save files under them, which fails:

--2014-05-29 17:34:50--  https://workbench.4xphq.arvadosapi.com/collections/download/3bcb4a087ce4f1db3126b81204f16eef+92/5dvoynjlty21p41ts0yno9g0izrins8delmexuh9wuvndvhcmw/testcoll/alice.txt
Reusing existing connection to workbench.4xphq.arvadosapi.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
5dvoynjlty21p41ts0yno9g0izrins8delmexuh9wuvndvhcmw/testcoll: Not a directory5dvoynjlty21p41ts0yno9g0izrins8delmexuh9wuvndvhcmw/testcoll/alice.txt: Not a directory

Cannot write to `5dvoynjlty21p41ts0yno9g0izrins8delmexuh9wuvndvhcmw/testcoll/alice.txt' (Not a directory).

We need to figure out a solution to this. So far there doesn't seem to be a wget command-line switch that will do it. Based on a mirror of my personal site, it seems like wget does make the directories if they're linked to as directories, so adding otherwise-empty directory links to the page might solve the issue. Maybe those links can return 404, so we trick wget into making the directories without saving anything else that would clutter the Collection download? This needs more investigation/testing.

  1. Offer a sensible "subdirectory view" at /collections/download/{uuid}/{token}/foo/, i.e., showing links to only files whose path starts with ./foo/
  2. At any given level, show contents of the collection/subtree like this:
    • foo/
    • foo/bar.txt
    • foo/baz/
    • foo/baz/waz.txt
    • ...
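
The listing proposed above could be generated roughly as follows. This is only a sketch, not the Workbench implementation: the function name `subtree_listing` and the assumption that file paths arrive as strings like `./foo/bar.txt` are hypothetical. It emits every intermediate directory (with a trailing slash) ahead of the files it contains, relative to the requested subdirectory.

```python
from typing import Iterable, List


def subtree_listing(file_paths: Iterable[str], subdir: str = "") -> List[str]:
    """Return directory and file entries found under `subdir`.

    Entries are relative to `subdir`. Directory entries end with "/"
    and sort before the entries they contain.
    """
    # Normalize the requested subdirectory to "" or "name/".
    prefix = subdir.strip("/")
    prefix = prefix + "/" if prefix else ""

    entries = set()
    for path in file_paths:
        # Assumed input form: paths may start with "./".
        if path.startswith("./"):
            path = path[2:]
        if not path.startswith(prefix):
            continue  # file lives outside the requested subtree
        rel = path[len(prefix):]
        if not rel:
            continue
        parts = rel.split("/")
        # Add each parent directory as its own entry, e.g. "foo/", "foo/baz/".
        for i in range(1, len(parts)):
            entries.add("/".join(parts[:i]) + "/")
        entries.add(rel)

    # A directory entry is a string prefix of everything inside it,
    # so plain lexicographic order lists it first.
    return sorted(entries)


if __name__ == "__main__":
    files = ["./foo/bar.txt", "./foo/baz/waz.txt"]
    for entry in subtree_listing(files):
        print(entry)
```

With the two example files, the top-level view yields `foo/`, `foo/bar.txt`, `foo/baz/`, `foo/baz/waz.txt`, and the view for `foo` yields `bar.txt`, `baz/`, `baz/waz.txt`.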

Associated revisions

Revision 7e723d29
Added by Brett Smith about 6 years ago

2937: Make sure Collection share links end with /.

This is necessary to prevent wget from saving a plain HTML file with
the name of the reader token. With a file there, it's not possible to
make a directory tree for recursive downloads.

Closes #2937.
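
The collision described in this commit message can be reproduced without wget. The sketch below is illustrative only: the helper names `ensure_trailing_slash` and `can_create_subdir` are hypothetical, the URL handling ignores query strings, and the simulation simply mimics a downloader that has already saved either a plain file or a directory under the token name.

```python
import os
import tempfile


def ensure_trailing_slash(url: str) -> str:
    """Return `url` unchanged if it ends with "/", else append one."""
    return url if url.endswith("/") else url + "/"


def can_create_subdir(root: str, token: str, token_saved_as_file: bool) -> bool:
    """Simulate a download root and try to create `token`/testcoll in it.

    If the token path was first saved as a regular file (what happens when
    the share link lacks a trailing slash), creating a directory beneath it
    fails. If it was created as a directory, the subdirectory can be made.
    """
    token_path = os.path.join(root, token)
    if token_saved_as_file:
        # Stand-in for the HTML page saved under the token's name.
        with open(token_path, "w") as f:
            f.write("<html></html>")
    else:
        os.mkdir(token_path)
    try:
        os.makedirs(os.path.join(token_path, "testcoll"))
        return True
    except OSError:
        # On Linux this is NotADirectoryError, matching the log above.
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(can_create_subdir(root, "token", token_saved_as_file=True))
    with tempfile.TemporaryDirectory() as root:
        print(can_create_subdir(root, "token", token_saved_as_file=False))
```

The first call reports that the subdirectory cannot be created and the second that it can, which is why the share link needs to end with `/`.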


#1 Updated by Tom Clegg about 6 years ago

  • Target version set to 2014-07-16 Sprint

#2 Updated by Tom Clegg about 6 years ago

  • Story points set to 1.0

#3 Updated by Tom Clegg about 6 years ago

  • Description updated

#4 Updated by Brett Smith about 6 years ago

  • Assigned To set to Brett Smith

#5 Updated by Brett Smith about 6 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:7e723d29f206fcc5ba4d8a26f8d547a72d5f0425.
