Feature #22076

open

keep-web can create a zipfile on the fly of a collection

Added by Peter Amstutz 7 months ago. Updated about 10 hours ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: Keep
Target version:
Story points: -

Description

The zip file is requested by making a POST request to the root of the collection's WebDAV endpoint on keep-web. The collection can be addressed by PDH or UUID.

This should work the same whether using the "inline" or "download only" endpoint. The request must target the collection root, not a subdirectory.

Indicate that it should be a zipfile by providing the header Accept: application/zip (confirm that is the right MIME type).

The POST body is either empty (get the whole collection), or a JSON array of strings which are paths within the collection to be included in the zip.
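As an illustration of the request shape described above, here is a minimal client-side sketch that builds (but does not send) such a request. The helper name `build_zip_request`, the example URL, and the `Content-Type: application/json` header are assumptions for illustration and are not specified in this ticket.

```python
import json
import urllib.request


def build_zip_request(collection_root_url, paths=None):
    """Build (without sending) a POST request asking for a zip of a collection.

    collection_root_url is assumed to point at the collection root.
    paths is an optional list of files/directories inside the collection.
    """
    headers = {"Accept": "application/zip"}
    if paths:
        # A JSON array of strings selects specific files or directories.
        body = json.dumps(list(paths)).encode("utf-8")
        # Assumption: the ticket does not say which Content-Type to send.
        headers["Content-Type"] = "application/json"
    else:
        # No body means "give me the whole collection".
        body = None
    return urllib.request.Request(
        collection_root_url, data=body, headers=headers, method="POST"
    )


# Placeholder host and UUID, for illustration only.
req = build_zip_request(
    "https://collections.example.com/c=zzzzz-4zz18-0123456789abcde/",
    ["dir1", "a.txt"],
)
```

Passing the resulting object to `urllib.request.urlopen` would send it; authentication is left out of this sketch.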

Each path refers to a file or a directory. If a path refers to a directory, the zip gets the entire contents of that directory. If there is both a reference to a subdirectory and to a specific file within that subdirectory, the zip gets the whole subdirectory (the file reference is redundant).

The list of files/directories should be sorted so they always download in the same order.
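The redundancy and ordering rules above could be sketched roughly as follows. This is only an illustration of the intended behavior, not keep-web's implementation; `normalize_selection` is a hypothetical name, and plain string ordering is an assumption (the ticket only asks for a stable order).

```python
def normalize_selection(paths):
    """Return the requested paths in sorted order, dropping any path that is
    already covered by a requested ancestor directory."""
    selected = {p.strip("/") for p in paths}
    kept = []
    for path in sorted(selected):
        parts = path.split("/")
        # Every proper ancestor of "a/b/c" is "a" and "a/b".
        ancestors = ("/".join(parts[:i]) for i in range(1, len(parts)))
        if not any(a in selected for a in ancestors):
            kept.append(path)
    return kept


selection = normalize_selection(["dir1/file.txt", "b.txt", "dir1", "a/b/c"])
```

Comparing whole path components (rather than string prefixes) keeps `dir10/x` from being treated as part of `dir1`.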

The zip file should be streamed to avoid excessive copying or use of staging storage.
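To show that a zip archive can be produced without seeking or staging, here is a small sketch using Python's standard `zipfile` module, which can write to a non-seekable, write-only stream. It is not how keep-web does it; `ChunkSink` and `stream_zip` are hypothetical names, and the sink stands in for an HTTP response body. Entries are stored uncompressed.

```python
import zipfile


class ChunkSink:
    """Write-only, non-seekable destination (stand-in for a response body)."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass


def stream_zip(sink, files):
    """Write (name, iterable_of_byte_chunks) pairs to sink as a stored zip.

    Each chunk is forwarded as soon as it is read, so no file is held in
    memory or on disk as a whole.
    """
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for name, chunks in files:
            # Because the sink cannot seek, zipfile writes sizes and CRC in a
            # data descriptor after the file data instead of patching the header.
            with zf.open(name, mode="w") as entry:
                for chunk in chunks:
                    entry.write(chunk)


sink = ChunkSink()
stream_zip(sink, [("a.txt", [b"hello ", b"world"]), ("dir1/b.txt", [b"x"])])
```

For entries larger than 2 GiB on a non-seekable stream, `zf.open` would need `force_zip64=True`, since the size is not known up front.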

If any of the file paths requested do not exist in the collection, return an error.
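One way to picture this check is to validate the whole request before any zip data is written, so that the error can still be reported cleanly. The sketch below assumes the collection is available as a flat list of file paths; `resolve_selection` and `MissingPathError` are hypothetical names, and the ticket does not say which HTTP status the error should map to.

```python
class MissingPathError(Exception):
    """Raised when a requested path matches nothing in the collection."""


def resolve_selection(requested, collection_files):
    """Expand requested files/directories into a sorted list of file paths.

    Raises MissingPathError if any requested path matches neither a file
    nor a directory prefix in collection_files.
    """
    missing = []
    selected = set()
    for path in requested:
        path = path.strip("/")
        # A path matches itself (a file) or everything beneath it (a directory).
        matches = [
            f for f in collection_files
            if f == path or f.startswith(path + "/")
        ]
        if not matches:
            missing.append(path)
        selected.update(matches)
    if missing:
        raise MissingPathError("not found in collection: " + ", ".join(sorted(missing)))
    return sorted(selected)


files = ["a.txt", "dir1/b.txt", "dir1/sub/c.txt", "dir10/d.txt"]
expanded = resolve_selection(["dir1", "dir1/b.txt"], files)
```

Note that a flat file listing cannot represent empty directories, so this sketch would report an empty directory as missing.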

Check with customer

We probably do not need to support Range requests, this will be confirmed with customer.

We probably don't need to compress the files, but need to check.

Consider including the ".arvados#collection" file in the zip with the collection metadata.

Answers from customer (Feb 3)

  • We probably do not need to support Range requests, this will be confirmed with customer.

Answer: no. The intended use case is people downloading with a browser. Browsers typically don't implement resumable downloads, which would be the main reason to support Range requests.

  • We probably don't need to compress the files, but need to check.

Convinced them that compression involves tradeoffs and isn't necessary for the initial implementation. Enabling compression could be a future improvement.

  • Consider including the ".arvados#collection" file in the zip with the collection metadata.

Agreed that the usefulness of including metadata about the Arvados collection outweighs the risk of end user confusion.

I noted it probably should not be called exactly ".arvados#collection", since that is a special file used by some tools to detect directories backed by arv-mount. We should think about what to call it instead, maybe "collection.json"?


Subtasks 1 (1 open, 0 closed)

Task #22642: Review (status: New, assigned to: Brett Smith)