Story #9945

[SDK] Package Python apps as virtualenvs

Added by Peter Amstutz over 2 years ago. Updated about 11 hours ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: -
Target version:
Start date: 09/07/2016
Due date:
% Done: 0%
Estimated time: (Total: 0.00 h)
Story points: -

Description

Arvados tools and SDKs written in Python often require third-party packages that are not available as OS packages, or require a newer version than the OS package provides. Currently we package these as backports, but this is fairly high-maintenance and (when upgrading existing OS packages) runs the risk of breaking the OS. Investigate the alternative of creating deb and rpm packages that use a Python virtualenv to isolate the package dependencies.

After a brief survey I've found a couple of tools for doing this:

https://github.com/spotify/dh-virtualenv

https://github.com/kevinconway/rpmvenv

fpm also has support?

https://github.com/jordansissel/fpm/issues/697

https://github.com/jordansissel/fpm/pull/930


Subtasks

Task #9963: bring our fpm fork in line with latest head (New, Ward Vandewege)


Related issues

Related to Arvados - Bug #9944: [CWL] python-lockfile version conflict (Resolved)

Related to Arvados - Bug #14326: Our custom-compiled `python-future` and `python3-future` packages can't be installed together and have precedence (New)

History

#1 Updated by Peter Amstutz over 2 years ago

  • Description updated (diff)

#2 Updated by Peter Amstutz over 2 years ago

  • Description updated (diff)

#3 Updated by Peter Amstutz over 2 years ago

  • Description updated (diff)

#4 Updated by Peter Amstutz over 2 years ago

root@e9972d7acb01:~# cat > after-inst <<EOF
> #!/bin/sh
> cd /usr/bin
> ln -s /usr/share/python/arvados-cwl-runner/bin/arvados-cwl-runner
> ln -s /usr/share/python/arvados-cwl-runner/bin/arvados-cwl-runner cwl-runner
> EOF
root@e9972d7acb01:~# fpm --after-install=after-inst -s virtualenv -t deb -n arvados-cwl-runner arvados-cwl-runner 

#5 Updated by Peter Amstutz over 2 years ago

Related, using fpm+gem to produce self-contained Ruby packages: http://blog.gemnasium.com/post/60091718748/package-your-ruby-based-tool-the-safe-and-easy-way

#6 Updated by Tom Morris over 1 year ago

  • Target version set to Arvados Future Sprints

#7 Updated by Ward Vandewege about 1 month ago

  • Status changed from New to In Progress
  • Assigned To set to Ward Vandewege
  • Target version changed from Arvados Future Sprints to 2018-12-21 Sprint

#8 Updated by Tom Morris 21 days ago

  • Target version changed from 2018-12-21 Sprint to 2019-01-16 Sprint

#9 Updated by Ward Vandewege 21 days ago

  • Related to Bug #14326: Our custom-compiled `python-future` and `python3-future` packages can't be installed together and have precedence added

#10 Updated by Tom Morris 7 days ago

  • Target version changed from 2019-01-16 Sprint to 2019-01-30 Sprint

#11 Updated by Ward Vandewege 1 day ago

Ready for (initial?) review at 5a09ff5d42b0f8b71ca6775813e0844067363e12

This still builds our backported libcloud package.

#12 Updated by Tom Clegg about 11 hours ago

I'm not sure why this

set -e
arvados-node-manager --version
set +e

PYTHON=`ls /usr/share/python2.7/dist/arvados-node-manager/bin/python2.7`

if [ "$PYTHON" != "" ]; then
  set -e
  exec $PYTHON <<EOF
...
else
  exit 1
fi

...isn't it equivalent to just using an explicit python binary like this?

set -e
arvados-node-manager --version
exec /usr/share/python2.7/dist/arvados-node-manager/bin/python2.7 <<EOF
...
Of course, the whole idea of an "import" test seems a bit weird here. Mostly this is scope creep, but I'll observe anyway...
  • In the arvados-node-manager case, it tests some libcloud thing that (I'm guessing) isn't tested by "arvados-node-manager --version", which seems fine.
  • In the arvados-cwl-runner case, I'm not sure what the inline script tells us that `arvados-cwl-runner --version` doesn't.
  • In the arvados-fuse case, it looks like we test a script that imports arvados_fuse (which nobody does, afaik) but we don't try running "arv-mount --version", which seems more relevant. This seems like a pre-existing bug, but now might be an appropriate time to fix this one, since "run arv-mount from PATH and expect it to import all its things, including llfuse C libs" is the sort of thing we could be breaking here.
  • Likewise, in arvados-python-client, we should add something like "arv-put --help".
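A minimal sketch of the "run the installed command from PATH" style of check suggested in the bullets above. The `smoke` helper is hypothetical and not part of the branch under review; only the command lines in the trailing comments come from this discussion.

```shell
#!/bin/sh
# Hypothetical helper: run one command line and report whether it worked.
# Output is discarded; only the exit status matters for the smoke test.
smoke() {
  if "$@" >/dev/null 2>&1; then
    echo "ok: $*"
  else
    echo "FAIL: $*"
    return 1
  fi
}

# Intended use in a package test script, with the commands named above:
#   smoke arv-mount --version
#   smoke arv-put --help
```

Because the command is looked up on PATH, a missing or broken entry point (including a failed import of a C extension) shows up as a non-zero status.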

In fpm_build_virtualenv...

The "*" should be quoted or escaped here:

+  find build -iname *.pyc -exec rm {} \;
+  find build -iname *.pyo -exec rm {} \;
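To illustrate why the quoting matters, here is a small sketch run in a scratch directory (all file names are made up for the demonstration). When a file matching the pattern exists in the current directory, the shell expands the unquoted `*.pyc` before `find` sees it.

```shell
#!/bin/sh
# Scratch directory for the demonstration; every file name here is made up.
work=$(mktemp -d)
mkdir -p "$work/build/pkg"
touch "$work/stray.pyc" "$work/build/pkg/a.pyc" \
      "$work/build/pkg/b.pyc" "$work/build/pkg/keep.py"

# Unquoted: the shell expands *.pyc to "stray.pyc" (a file in the current
# directory) before find runs, so find only looks for that exact name.
unquoted_count=$(cd "$work" && find build -iname *.pyc | wc -l | tr -d ' ')

# Quoted: find receives the pattern itself and matches both .pyc files.
quoted_count=$(cd "$work" && find build -iname '*.pyc' | wc -l | tr -d ' ')

# The cleanup step with the pattern quoted.
(cd "$work" && find build -iname '*.pyc' -exec rm {} \;)
remaining=$(cd "$work" && find build -type f | wc -l | tr -d ' ')

echo "unquoted=$unquoted_count quoted=$quoted_count remaining=$remaining"
rm -rf "$work"
```

The unquoted form only works when nothing in the current directory happens to match the pattern, which makes the failure easy to miss.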

There are some explicit checks for $? so I'm suspecting "set -e" isn't in force in fpm_build_virtualenv, so there are some unchecked errors (including the above "find" commands, "pip install pip", and "cp -f"). I suppose we can't do "set -e" somewhere without accidentally making global changes and having to rewrite Everything...

Suggest

-if [[ "${DEBUG:-0}" != "0" ]]; then ...
+if [[ -n "${DEBUG}" ]]; then ...

Some typos "Arvadow", "execurin", "sectiontoo"

What's the story behind changing the version discovery? It used to be passed in from run-build-packages ("ARVADOS_BUILDING_VERSION if given, else extract from egg PKG-INFO / ARVADOS_BUILDING_ITERATION if given, else depends on package"); now, run-library runs "setup.py sdist" and parses the resulting tarball filename to get VERSION, and defaults to iteration "-1". (I think setting iteration to -1 is a good move, except that if we want this merge to produce a set of dev packages, we might need to either set iteration to -5 or update the last-change-detection hairball to notice changes in /build. Ugh)

I think source:services/dockercleaner/arvados-docker-cleaner.service needs to be updated: python33→python35, and fix the hardcoded bin path used to detect scl:

ExecStart=/bin/sh -c 'if [ -e /opt/rh/python33/root/bin/arvados-docker-cleaner ]; then exec scl enable python33 arvados-docker-cleaner; else exec arvados-docker-cleaner; fi'

I'm wary of the way the prerm/postinst scripts and the fpm_build function are getting forked here.
  • The new prerm/postinst scripts duplicate some stuff from the old ones, and don't seem to enable the systemd units, or check whether $1 indicates the appropriate hook (the old ones do this, so I'm guessing it's needed despite the names of the fpm args)
  • The way the new prerm/postinst scripts get autogenerated seems messy, so I'm wondering if there's another way. Could it list files in /usr/share/python*/dist/$pkg/bin/ at runtime? Or are there some fpm args we can use to install our own bin stubs/symlinks as /usr/bin/whatever? Even if we still need to autogenerate the stubs, at least that way yum/apt would know where they came from, avoid installing other packages with conflicting binaries, etc.
  • The new fpm_build_virtualenv has a lot of overlap with fpm_build, but does some things differently (e.g., it calls test_package_presence instead of letting the caller do that). Meanwhile the old fpm_build still has lots of (now unused?) python-specific stuff. I don't know which is less messy (two similar funcs vs. lots of if-else). Splitting might be better but it would be good to keep them congruent.
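The "list files at runtime" idea from the second bullet could look roughly like the sketch below. The `link_venv_bins` helper and its skip list (interpreter, pip and activation scripts) are assumptions made for illustration, not something taken from the branch; the demonstration uses scratch directories rather than the real /usr/share/python*/dist/$pkg/bin/ and /usr/bin paths.

```shell
#!/bin/sh
# Hypothetical helper: symlink a package's virtualenv entry points into a
# target bin directory. The skip list below is an assumption.
link_venv_bins() {
  venv_bin=$1   # e.g. /usr/share/python2.7/dist/<pkg>/bin
  target=$2     # e.g. /usr/bin
  for f in "$venv_bin"/*; do
    # Only consider regular, executable files.
    [ -f "$f" ] && [ -x "$f" ] || continue
    name=$(basename "$f")
    case "$name" in
      python*|pip*|easy_install*|activate*|wheel) continue ;;
    esac
    ln -sf "$f" "$target/$name"
  done
}

# Demonstration against scratch directories instead of the real paths.
demo=$(mktemp -d)
mkdir -p "$demo/venv/bin" "$demo/usr-bin"
printf '#!/bin/sh\n' > "$demo/venv/bin/arv-mount"
printf '#!/bin/sh\n' > "$demo/venv/bin/python2.7"
chmod +x "$demo/venv/bin/arv-mount" "$demo/venv/bin/python2.7"
link_venv_bins "$demo/venv/bin" "$demo/usr-bin"
ls "$demo/usr-bin"
```

Note that symlinks created this way are still invisible to yum/apt, so this only addresses the autogeneration concern, not the file-ownership one.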

Nit/style thing: I find it easier to follow "if ! foo; then ...; fi" than "foo; if $? != 0 ; then ...; fi". The first form also has an advantage that it works the same way in "set -e" and "set +e" contexts.
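A small sketch of the difference between the two forms, run in child shells with and without `-e`. The `step` function is a hypothetical stand-in for any command that can fail.

```shell
#!/bin/sh
# Both snippets define "step", a stand-in command that always fails.
snippet_status='step() { return 3; }; step; if [ $? -ne 0 ]; then echo failed; fi'
snippet_inline='step() { return 3; }; if ! step; then echo failed; fi'

# Without -e, both forms reach their error branch.
status_plain=$(sh -c "$snippet_status")
inline_plain=$(sh -c "$snippet_inline")

# With -e, the separate-check form exits at "step" before the check runs,
# so nothing is printed. The inline form is exempt from -e because the
# command is being tested by "if".
status_strict=$(sh -e -c "$snippet_status" || true)
inline_strict=$(sh -e -c "$snippet_inline")

echo "plain:  status-check='$status_plain' inline='$inline_plain'"
echo "strict: status-check='$status_strict' inline='$inline_strict'"
```

Only the inline form produces the same result in both modes, which is the advantage described above.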

The deletion of build.list (and its version compatibility maintenance nightmare) sure makes me happy... this is very encouraging. Thanks!
