Project

General

Profile

Actions

Bug #4520

closed

[SDKs] arv copy should respect --project-uuid when saving destination Collection

Added by Sarah Guthrie over 9 years ago. Updated about 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
SDKs
Target version:
Story points:
0.5

Description

arv copy --src settings --dst 9tee4 --project-uuid 9tee4-j7d0g-f75lcwo78ifvnl2 qr1hi-4zz18-85b713kdm9kle3r
(where settings pointed to qr1hi configuration file on lightning-dev3)
saved the new collection in the Home project instead of the project pointed to by the project-uuid.

The same behavior occurred when copying a 100MB collection and a pipeline template.

If --project-uuid only applies to pipeline instances, better -h documentation would be helpful


Subtasks 2 (0 open2 closed)

Task #5238: Review 4520-arv-copy-project-uuidResolvedPeter Amstutz11/13/2014Actions
Task #5114: FixResolvedPeter Amstutz11/13/2014Actions

Related issues

Has duplicate Arvados - Bug #5294: [SDKs] Make arv-copy useful for copying a Docker imageClosed02/23/2015Actions
Actions #1

Updated by Tim Pierce over 9 years ago

  • Category set to SDKs
  • Target version set to Bug Triage
Actions #2

Updated by Tim Pierce over 9 years ago

  • Subject changed from arv copy did not pay attention to --project-uuid given to [SDKs] arv copy did not pay attention to --project-uuid given
Actions #3

Updated by Brett Smith over 9 years ago

  • Subject changed from [SDKs] arv copy did not pay attention to --project-uuid given to [SDKs] arv copy does not respect --project-uuid when copying Collection

This might be related to #3847.

Actions #4

Updated by Tom Clegg over 9 years ago

  • Subject changed from [SDKs] arv copy does not respect --project-uuid when copying Collection to [SDKs] arv copy should respect --project-uuid when saving destination Collection
  • Story points set to 0.5
Actions #5

Updated by Tom Clegg over 9 years ago

  • Target version changed from Bug Triage to Arvados Future Sprints
Actions #6

Updated by Tom Clegg over 9 years ago

  • Target version changed from Arvados Future Sprints to 2015-02-18 sprint
Actions #7

Updated by Radhika Chippada over 9 years ago

  • Assigned To set to Radhika Chippada
Actions #8

Updated by Radhika Chippada about 9 years ago

  • Assigned To deleted (Radhika Chippada)
Actions #9

Updated by Peter Amstutz about 9 years ago

  • Assigned To set to Peter Amstutz
Actions #10

Updated by Peter Amstutz about 9 years ago

  • Assigned To deleted (Peter Amstutz)
Actions #11

Updated by Peter Amstutz about 9 years ago

  • Assigned To set to Peter Amstutz
Actions #12

Updated by Peter Amstutz about 9 years ago

Notes:

It should now correctly respect --project-uuid for collections, templates, pipeline instances, and docker image metadata.

If a collection already exists on the target system, it will create a new collection in the target project using the existing manifest_text from the existing collection.

Actions #13

Updated by Peter Amstutz about 9 years ago

  • Status changed from New to In Progress
Actions #14

Updated by Peter Amstutz about 9 years ago

  • Target version changed from 2015-02-18 sprint to 2015-03-11 sprint
Actions #15

Updated by Brett Smith about 9 years ago

Reviewing 74784b3.

  • The code to take a source collection and turn it into a collection on dst (including necessary munging) seems long enough to be worth DRYing up at this point.
  • Why did the log level of the "Skipping message" collection upgrade from DEBUG to INFO? It seems like this could occasionally make the program noticeably more verbose, and I'm not sure how it helps the user to know this.

While I was reading this I noticed some other bugs in the code you're touching. If you're up for fixing them, I'd say go for it. If not, we should file separate bugs about them.

  • The collection create calls don't respect args.retries.
  • The docker_image_hash link is copied incorrectly. The name should be Docker's native "image ID," not the portable data hash of the corresponding collection.

Thanks.

Actions #16

Updated by Peter Amstutz about 9 years ago

Brett Smith wrote:

Reviewing 74784b3.

  • The code to take a source collection and turn it into a collection on dst (including necessary munging) seems long enough to be worth DRYing up at this point.

Done.

  • Why did the log level of the "Skipping message" collection upgrade from DEBUG to INFO? It seems like this could occasionally make the program noticeably more verbose, and I'm not sure how it helps the user to know this.

Took that out entirely and revamped the copying the logic to actually check the owner_uuid, name and portable_data_hash of existing collection at the destination before deciding whether to make a copy.

While I was reading this I noticed some other bugs in the code you're touching. If you're up for fixing them, I'd say go for it. If not, we should file separate bugs about them.

  • The collection create calls don't respect args.retries.

Fixed.

  • The docker_image_hash link is copied incorrectly. The name should be Docker's native "image ID," not the portable data hash of the corresponding collection.

Fixed. Also looks for (and copies) Docker metadata any time it is copying a collection and not only when copying pipelines.

Bonus assertion added to keep.py because it turns out arv-copy was passing unicode strings to KeepClient.put() (whoops!)

Now at a47dc2b

Actions #17

Updated by Brett Smith about 9 years ago

Reviewing a47dc2b

Peter Amstutz wrote:

  • The docker_image_hash link is copied incorrectly. The name should be Docker's native "image ID," not the portable data hash of the corresponding collection.

Fixed. Also looks for (and copies) Docker metadata any time it is copying a collection and not only when copying pipelines.

Thanks, that basically covers #5294.

Bonus assertion added to keep.py because it turns out arv-copy was passing unicode strings to KeepClient.put() (whoops!)

Please change this condition to isinstance(data, str) to accept subclasses, and add a test.

The new line args.project_uuid = dst_arv.users().current().execute()["uuid"] should respect args.retries.

Thanks.

Actions #18

Updated by Peter Amstutz about 9 years ago

Brett Smith wrote:

Please change this condition to isinstance(data, str) to accept subclasses, and add a test.

Fixed.

The new line args.project_uuid = dst_arv.users().current().execute()["uuid"] should respect args.retries.

Fixed.

Now at 7b0d393

Actions #19

Updated by Brett Smith about 9 years ago

Peter Amstutz wrote:

Brett Smith wrote:

Please change this condition to isinstance(data, str) to accept subclasses, and add a test.

Fixed.

The new line args.project_uuid = dst_arv.users().current().execute()["uuid"] should respect args.retries.

Fixed.

Both of these changes are good. Unfortunately, the change to make CollectionWriter.manifest_text() encode the return string isn't going to fly. The entire interface to CollectionWriter is already bytes-based, so users need to pass in encoded strings for filenames already. For anybody already passing in non-ASCII, this change will basically attempt a double-encode and break:

>>> import arvados
>>> cw = arvados.CollectionWriter()
>>> with cw.open(u'♥'.encode('utf-8')) as cf:
...     pass
...
>>> cw.manifest_text()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "arvados/collection.py", line 588, in manifest_text
    return manifest.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 41: ordinal not in range(128)

If you're okay reverting that change, I think this is good to merge. If there's some nuance I'm missing, let's talk about it. Thanks.

Actions #20

Updated by Peter Amstutz about 9 years ago

  • Don't encode manifest_text by default to avoid double-encode faults.
  • Coerce unicode strings to ascii in put(). Will raise an error if it is not a valid conversion.
  • Use result.content (returns literal result bytes) instead result.text (returns unicode) when processing Keep responses.
  • Updated test.
Actions #21

Updated by Brett Smith about 9 years ago

Peter Amstutz wrote:

  • Updated test.

Please assert that the first put returns the right hash (so we're sure it encodes correctly), and ditch the noop foo_locator assignments here. Then go ahead and merge. Thanks.

Actions #22

Updated by Peter Amstutz about 9 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:f0fe7273c1851cb93e9edd58c0b60d3590b222ed.

Actions

Also available in: Atom PDF