Bug #5790

[SDKs] arv-copy does not select the correct Docker image to copy

Added by Brett Smith over 4 years ago. Updated about 4 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Brett Smith
Category:
SDKs
Target version:
Start date:
04/22/2015
Due date:
% Done:

100%

Estimated time:
(Total: 2.00 h)
Story points:
1.0

Description

A user just reported a bug where arv-copy was failing trying to copy a nonexistent collection UUID while copying a pipeline template. This collection was a Docker image that had since been deleted and replaced. Docker metadata links were still hanging around pointing to this collection, and apparently arv-copy followed them to try to copy it instead of the replacement collection.

Given this scenario, and that arv-copy worked after deleting the links, it seems likely that arv-copy is not using the appropriate metadata to decide which image to copy (the same logic in API server's Collection.find_all_for_docker_image).


Subtasks

Task #6103: Review 5790-copy-most-recent-docker-image-wipResolvedBrett Smith

Task #6107: Fix Docker image search/listResolvedBrett Smith

Associated revisions

Revision e8ae9364
Added by Brett Smith about 4 years ago

Merge branch '5790-copy-most-recent-docker-image-wip'

Closes #5790, #6103.

Revision 679ea52a (diff)
Added by Brett Smith about 4 years ago

5790: Fix PySDK Docker image listing comparing ints and datetimes.

Closes #5790.

History

#1 Updated by Brett Smith over 4 years ago

  • Target version changed from Bug Triage to 2015-06-10 sprint

#2 Updated by Tom Clegg over 4 years ago

  • Story points set to 1.0

#3 Updated by Peter Amstutz over 4 years ago

  • Assigned To set to Peter Amstutz

#4 Updated by Brett Smith over 4 years ago

  • Assigned To changed from Peter Amstutz to Brett Smith

#5 Updated by Brett Smith about 4 years ago

5790-copy-most-recent-docker-image-wip is up for review. It ends up revamping the Docker image list function in arv-keepdocker, because that's what arv-copy calls to find what it should copy. I ended up fixing several bugs while I was here. Relevant to arv-copy:

  • The original sorting bug: the code previously used created_at a sort key when image_timestamp was not available. This did not match the API server's behavior, which is to follow links with any image_timestamp before ones without.
  • Also like API server, ignore links that point to unknown collections.
  • Always fetch all Docker links, not just the first batch returned by the API server.
  • Docker images can be referred to by Docker image hash, so I added that search functionality to the list function.

Some listing improvements:

  • Show image hashes when listing images by name.
  • Like Docker itself, when an image has multiple names and we're not filtering by name, list each one.

#6 Updated by Brett Smith about 4 years ago

  • Status changed from New to In Progress

#7 Updated by Nico C├ęsar about 4 years ago

I finally checked and tested out.

This is a comparison of the output from master and the branch :

$ arv-keepdocker                                                                                                       
REPOSITORY                      TAG         IMAGE ID      COLLECTION                     CREATED                                                      
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  latest      d8309758b8fe  zzzzz-4zz18-1v45jub259sjjgb    Tue Jun 10 14:30:00 2014   
arvados/apitestfixture          latest      d8309758b8fe  zzzzz-4zz18-t68oksiu9m80s4y    Thu Jun 10 14:30:00 2010
$ arv-keepdocker                                                                                                       
REPOSITORY                      TAG         IMAGE ID      COLLECTION                     CREATED                                                      
arvados/apitestfixture          latest      d8309758b8fe  zzzzz-4zz18-1v45jub259sjjgb    Tue Jun 10 14:30:00 2014                                     
arvados/apitestfixture          june10      d8309758b8fe  zzzzz-4zz18-1v45jub259sjjgb    Tue Jun 10 14:30:00 2014                                     
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  latest      d8309758b8fe  zzzzz-4zz18-1v45jub259sjjgb    Tue Jun 10 14:30:00 2014   
arvados/apitestfixture          latest      d8309758b8fe  zzzzz-4zz18-t68oksiu9m80s4y    Thu Jun 10 14:30:00 2010

this looks good to me. merge it!

#8 Updated by Brett Smith about 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 50 to 100

Applied in changeset arvados|commit:e8ae9364dce380d305833f35dfb25578175664d7.

#9 Updated by Ward Vandewege about 4 years ago

  • Status changed from Resolved to In Progress

Looks like this change broke arv keep docker on 4xphq:

shell.4xphq:~$ arv keep docker
REPOSITORY                      TAG         IMAGE ID      COLLECTION                     CREATED             
Traceback (most recent call last):
  File "/usr/local/bin/arv-keepdocker", line 4, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/arvados/commands/keepdocker.py", line 291, in main
    for i, j in list_images_in_arv(api, args.retries):
  File "/usr/local/lib/python2.7/dist-packages/arvados/commands/keepdocker.py", line 241, in list_images_in_arv
    filters=search_filters + [['link_class', '=', 'docker_image_hash']])
  File "/usr/local/lib/python2.7/dist-packages/arvados/commands/keepdocker.py", line 187, in _get_docker_links
    links.sort(key=itemgetter('_sort_key'), reverse=True)
TypeError: can't compare datetime.datetime to int

If I comment out line 187 (and also line 274 where the same sort is done), then it works again.

#10 Updated by Brett Smith about 4 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:679ea52a117414fc33768aac887615f9b701fdf8.

Also available in: Atom PDF