Story #8568

[SDKs] `arv keep docker` supports Docker 1.10+

Added by Brett Smith over 5 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Radhika Chippada
Category:
SDKs
Target version:
Start date:
02/26/2016
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Description

If you try to run arv keep docker today on a host running Docker 1.10+, you'll get an exception when the tool tries to find a specific file in the generated tar file:

$ arv keep docker --project-uuid=qr1hi-j7d0g-rerv400sudof0mn name
1474M / 1474M 100.0% 
Collection saved as 'Docker image name:latest sha256:... (2)'
qr1hi-4zz18-k4b79jr1uamnila
Traceback (most recent call last):
  File "/usr/local/bin/arv-keepdocker", line 5, in <module>
    pkg_resources.run_script('arvados-python-client==0.1.20160128214108', 'arv-keepdocker')
  File "/usr/local/lib/python2.7/dist-packages/pkg_resources.py", line 517, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/local/lib/python2.7/dist-packages/pkg_resources.py", line 1436, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/arvados_python_client-0.1.20160128214108-py2.7.egg/EGG-INFO/scripts/arv-keepdocker", line 4, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/arvados_python_client-0.1.20160128214108-py2.7.egg/arvados/commands/keepdocker.py", line 401, in main
    json_file = image_tar.extractfile(image_tar.getmember(image_hash + '/json'))
  File "/usr/lib/python2.7/tarfile.py", line 1800, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'sha256:...d2b6/json' not found"

This happens because Docker changed the image tar format in Docker 1.10. Recognize and correctly upload images generated by Docker 1.10.

Changes to make

The output of docker images --no-trunc has changed so the image hash now includes the cryptographic hash name. It looks like sha256:4fe79ae514594715386c226e89a53c8037e081603f93f65a47c82573359f70cb.

Right now arv-keepdocker looks for a JSON metadata file in the .tar under <image hash>/json:

json_file = image_tar.extractfile(image_tar.getmember(image_hash + '/json'))

This needs to be updated to look for a file named <unmarked hash>.json. So, in the example hash above, arv-keepdocker needs to look for a metadata file named 4fe79ae514594715386c226e89a53c8037e081603f93f65a47c82573359f70cb.json.

Suggest finding the metadata file with something like the following:

image_hash_type, _, raw_image_hash = image_hash.rpartition(':')
if image_hash_type:
    json_filename = raw_image_hash + '.json'
else:
    json_filename = raw_image_hash + '/json'
json_file = image_tar.extractfile(image_tar.getmember(json_filename))

I have confirmed that the JSON file still has the information arv-keepdocker needs in the new image format.


Subtasks

Task #9641: Review branch 8568-docker-version-supportResolvedRadhika Chippada


Related issues

Related to Arvados - Support #9628: CWL Tutorial Does not RunResolved07/18/2016

Associated revisions

Revision be87361d
Added by Radhika Chippada about 5 years ago

closes #8568
Merge branch '8568-docker-version-support'

History

#1 Updated by Brett Smith over 5 years ago

  • Description updated (diff)

#2 Updated by Brett Smith about 5 years ago

  • Target version set to Arvados Future Sprints

#3 Updated by Brett Smith about 5 years ago

  • Description updated (diff)

#4 Updated by Brett Smith about 5 years ago

  • Description updated (diff)

#5 Updated by Radhika Chippada about 5 years ago

  • Assigned To set to Radhika Chippada
  • Story points set to 1.0

#6 Updated by Radhika Chippada about 5 years ago

  • Target version changed from Arvados Future Sprints to 2016-08-03 sprint

#7 Updated by Radhika Chippada about 5 years ago

Branch 8568-docker-version-support is ready for review.

Just did the replacement code copy paste from the description. Manually tested the code block is sane. There do not seem to be any existing tests in this area and hence I did not make any tests updates.

All python sdk tests passed.

#8 Updated by Radhika Chippada about 5 years ago

  • Status changed from New to In Progress

#9 Updated by Lucas Di Pentima about 5 years ago

Look good to me.

#10 Updated by Radhika Chippada about 5 years ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:be87361dedf4e35405616e802fba12dedf86dfde.

#11 Updated by Tom Morris about 5 years ago

Do we need test coverage here? Was test writing the reason it was estimated at a full story point?

Do we need to improve our process so that we test with Docker releases when they hit beta (or no later than immediately after they're released) or should we instead have a compatibility matrix which shows which versions we've tested with and are supported? Docker's such a critical component for the system that I think we need to have a good story here.

The current release is 1.11.2 and 1.12 is up to RC5 https://github.com/docker/docker/releases/tag/v1.12.0-rc5
1.10 was released Feb. 4 after four release candidates over a two week period, so has been in the wild for a number of months.

#12 Updated by Brett Smith about 5 years ago

Tom Morris wrote:

Do we need test coverage here?

Yes. arv keep docker registers Docker images in Arvados. It is critical to building workflows under both Crunch v1 and v2. No other tool does its job, so if it fails, users are up a creek.

Was test writing the reason it was estimated at a full story point?

My recollection is that it was pointed at one point primarily because of uncertainty. There was a possibility that additional code would be necessary to achieve full Docker 1.10+ compatibility. The conservative pointing was intended to help compensate for this.

Do we need to improve our process so that we test with Docker releases when they hit beta (or no later than immediately after they're released) or should we instead have a compatibility matrix which shows which versions we've tested with and are supported?

It would be an improvement. Docker's development pace vastly exceeds ours; in the less than two years we've integrated with it, there have been two breaking compatibility changes like this (the other was #4196).

#9428 would also help mitigate the risk of compatibility drift, by having this client tool talk to a specific version of the Docker API directly, rather than calling out to the Docker CLI.

Docker's such a critical component for the system that I think we need to have a good story here.

The current release is 1.11.2 and 1.12 is up to RC5 https://github.com/docker/docker/releases/tag/v1.12.0-rc5
1.10 was released Feb. 4 after four release candidates over a two week period, so has been in the wild for a number of months.

Part of why we've been dragging our feet is that the image file format change means that we have to devise and write a way to convert Docker images that are already registered in extant Arvados clusters, and so far there hasn't been demand for that. That's #8567.

Also available in: Atom PDF