Project

General

Profile

Actions

Feature #20029

closed

WB2 service object for manipulating files

Added by Peter Amstutz about 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench2
Story points:
2.0

Description

Support code for Workbench collection operations using the replace_files API (See: https://doc.arvados.org/v2.5/api/methods/collections.html#replace_files).

This should be part of the collection service class in wb2.

  • Batch delete a list of files
  • Batch copy list of files or directories from a collection to a new path in same or other collection
  • Batch move list of files or directories from a collection to a new path in same or other collection
  • Create new empty placeholder directory
  • Write unit tests

Subtasks 1 (0 open1 closed)

Task #20111: Review 20029-collection-batch-file-operationsResolvedStephen Smith03/15/2023Actions

Related issues

Blocks Arvados - Feature #20030: Use new service object for existing operationsResolvedStephen Smith03/17/2023Actions
Actions #1

Updated by Peter Amstutz about 1 year ago

  • Story points set to 2.0
  • Description updated (diff)
Actions #2

Updated by Peter Amstutz about 1 year ago

  • Blocks Feature #20030: Use new service object for existing operations added
Actions #3

Updated by Peter Amstutz about 1 year ago

  • Target version changed from Future to To be scheduled
Actions #4

Updated by Peter Amstutz about 1 year ago

  • Target version changed from To be scheduled to 2023-03-01 sprint
Actions #5

Updated by Peter Amstutz about 1 year ago

  • Assigned To set to Stephen Smith
Actions #6

Updated by Stephen Smith about 1 year ago

  • Status changed from New to In Progress
Actions #7

Updated by Peter Amstutz about 1 year ago

  • Target version changed from 2023-03-01 sprint to Development 2023-03-15 sprint
Actions #8

Updated by Tom Clegg about 1 year ago

For creating an empty dir, would this work?

"replace_files": {
  "/new/dir/name": "d41d8cd98f00b204e9800998ecf8427e+0/" 
}
Actions #9

Updated by Tom Clegg about 1 year ago

...or, the webdav way would be MKCOL: http://www.webdav.org/specs/rfc4918.html#METHOD_MKCOL

Actions #10

Updated by Peter Amstutz about 1 year ago

  • Target version changed from Development 2023-03-15 sprint to Development 2023-03-29 Sprint
Actions #11

Updated by Stephen Smith about 1 year ago

Changes at arvados-workbench2|4d78e6e875392cdcadad164cc3863a552343833f
Tests developer-tests-workbench2: #1121

  • Added batchFile Delete/Move/Copy and createDirectory methods
  • Added unit tests to verify replace_files calls
Actions #12

Updated by Lucas Di Pentima about 1 year ago

Some comments:

  • For the case of "renaming a file" (that is, a move operation within the same collection), the batchFileMove() method is not atomic, but the replace_files API allows it (see doc page's file rename example)
  • Related to the previous comment, the batchFileMove() could also just replace the moveFile() method. AFAICT, there's only one user of it (renameFile() from arvados-workbench2/src/store/collection-panel/collection-panel-files/collection-panel-files-actions.ts) so it would be easy to just make it a moveFiles() (plural) version, wdyt?
  • The batchFileDelete() method I think should replace the already existing (and unreliable) deleteFiles() method.
  • The createDirectory() method I think would be nice to be backed by the replace_files API too, for consistency's sake. I'm thinking that if we keep the majority of the file operations within the more "cheap" API, it would provide a more reliable service to the user. The replace_files API talks directly to the API and supports passing preserve_version: true, something that webdav doesn't support (see for example how moveFile() need to make a second call to preserve the version)
  • And if you agree with all of the above, the batchFileCopy() method could be renamed to copyFiles() to have naming consistency with the rest of the methods. I think the 'batch' part of the name is not important.
Actions #13

Updated by Stephen Smith about 1 year ago

  • For the case of moving within the same collection I can add a case to do that atomically, but as I understand it, moving between collections needs a second call to replace_files since the copy and delete operations making up the move in replace_files would have different destination collections, and the file targets are relative to the root of the collection UUID passed in the URL so only 1 collection can be worked on at a time
  • As far as replacing the existing methods, I was planning on doing that in the next ticket for this
  • I don't think it's possible to create an empty directory with replace_files unless you first find a collection with an empty directory to use as a source, so this is why I decided to use the webdav MKCOL method
Actions #14

Updated by Lucas Di Pentima about 1 year ago

Stephen Smith wrote in #note-13:

  • For the case of moving within the same collection I can add a case to do that atomically, but as I understand it, moving between collections needs a second call to replace_files since the copy and delete operations making up the move in replace_files would have different destination collections, and the file targets are relative to the root of the collection UUID passed in the URL so only 1 collection can be worked on at a time

Yeah, the ticket's description ask for both cases, and the "same collection" case could be easily optimized to be an atomic op. It's sad that we can't do the same when dealing with different collections, but at least the partial failure case is not destructive.

  • As far as replacing the existing methods, I was planning on doing that in the next ticket for this

It seems to me to be wasteful effort to create methods and then rename/rewrite them on a different branch, but if you think it's better that way, I won't argue :)

  • I don't think it's possible to create an empty directory with replace_files unless you first find a collection with an empty directory to use as a source, so this is why I decided to use the webdav MKCOL method

From chat: I think we can use the empty collection's PDH for creating empty directories.

Actions #15

Updated by Stephen Smith about 1 year ago

Changes at arvados-workbench2|9d2d11ffcf56ab3075fac66ac75d91448b4d8ea6
Tests developer-tests-workbench2: #1125

  • Did some reorganization
  • Added case for moving within a single collection to perform the action in a single request
  • I went ahead and renamed the operations and replaced the existing deleteFiles implementation
  • Also added additional renameFile to be used for single file renames since moveFiles is for moving multiple files to a new path without changing the filenames, and I don't think it's necessary to have the moveFiles method allow moving and changing the name of multiple files at the same time, and doing so would make the API more complicated when the only case for renaming files is currently just renaming a single file
  • Fixed bugs in switching existing uses to new methods
  • Consolidated & updated the collection unit tests
Actions #16

Updated by Lucas Di Pentima about 1 year ago

Two small suggestions:

  • I think the test called 'should remove files with uuid prefix' can be removed because seems to be testing something specific about webdav URLs
  • Nit: the emptyCollectionUuid constant refers to a PDH instead of a UUID.

Other than that, it LGTM. Thanks!

Actions #17

Updated by Stephen Smith about 1 year ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF