Bug #6638

[Deployment] Python package backports should declare their C dependencies

Added by Joshua Randall about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assigned To: Brett Smith
Category: Deployment
Target version:
Start date: 07/15/2015
Due date:
% Done: 100%
Estimated time: (Total: 2.00 h)
Story points: 1.0

Description

Original bug report

When installed on a bare-bones Debian or Ubuntu machine, the python-arvados-python-client package does not pull in the dependency for libcurl-gnutls.so.4, which is provided by libcurl3-gnutls.

I have confirmed this is an issue for both debian:wheezy and ubuntu:precise (using the base docker images for each).

The root problem

The Python package backports we build don't declare their C dependencies. We should fix this by declaring those dependencies on the backport packages themselves, for these reasons:

  • These dependencies vary by distro: CentOS builds PyCURL against NSS rather than GnuTLS
  • The package will integrate better with the rest of the surrounding system
  • Users will still get the desired behavior that necessary dependencies are pulled in automatically by the package manager

Known dependencies to declare:

  • PyCURL likely depends on libcurl and an SSL library (GnuTLS for Debian, NSS for Red Hat)
  • llfuse depends on FUSE libraries and libxattr.
  • Any others? Build our packages in a new virtualenv and watch what C extensions get built during the process.
    • PyYAML builds C, but it's not an installation dependency. It's only needed to run tests.

This is going to be non-trivial because right now the package builder builds these packages in a big for loop, with no variation, so that variation will have to be added. Likely the simplest way to do that would be to add a case statement inside the loop that creates an array of additional fpm arguments based on the package's name, then passes those arguments when it calls fpm_build.
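A minimal sketch of that case-statement approach, written as a Bash function so it can be run in isolation. The fpm_build stub here only echoes its arguments and stands in for the real helper; the function name build_backports, the target names, and the short package list are assumptions for illustration. The dependency names are the ones that show up in the ldd results later in this ticket, and --depends is fpm's flag for declaring a dependency.

```shell
#!/bin/bash
# Hypothetical stand-in for the real fpm_build helper: it only prints the
# arguments it would receive, so the sketch runs without fpm installed.
fpm_build() {
    echo "fpm_build $*"
}

# build_backports TARGET
# Loops over an illustrative list of backports and collects extra fpm
# arguments per package and per distro before calling fpm_build.
build_backports() {
    local target="$1" pkg
    for pkg in pycurl llfuse ciso8601; do
        local extra_args=()
        case "$target:$pkg" in
            debian*:pycurl|ubuntu*:pycurl)
                extra_args+=(--depends libcurl3-gnutls) ;;
            centos*:pycurl)
                extra_args+=(--depends libcurl) ;;
            debian*:llfuse|ubuntu*:llfuse)
                extra_args+=(--depends libfuse2) ;;
            centos*:llfuse)
                extra_args+=(--depends fuse-libs) ;;
        esac
        # Packages with no match get an empty array, i.e. no extra flags.
        fpm_build "$pkg" "${extra_args[@]}"
    done
}

build_backports debian7
```

Keying the case statement on both target and package name keeps the distro-specific variation in one place, which is the part the plain for loop cannot express today.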


Subtasks

Task #7052: Review 6638-python-backport-dependencies-wip | Resolved | Peter Amstutz

Task #6911: Update build scripts to add distro-specific dependencies for backported packages | Resolved | Tom Clegg

Task #6949: Review 6638-backport-deps (arvados + arvados-dev) | Resolved | Peter Amstutz

Task #6912: Determine dependencies for various distros | Resolved | Brett Smith


Related issues

Related to Arvados - Bug #6934: [Authentication] Write tests for pam module and shellinabox config | Resolved | 08/07/2015

Copied to Arvados - Bug #7184: [Deployment] Test distribution packages | New

Associated revisions

Revision 81b4b709
Added by Tom Clegg about 4 years ago

Merge branch '6638-backport-deps' refs #6638

Revision 1f0466a3
Added by Tom Clegg about 4 years ago

Merge branch '6638-backport-deps' refs #6638

Revision 73aca60f (diff)
Added by Tom Clegg about 4 years ago

Accept libcurl4-openssl-dev as an alternative to libcurl4-gnutls-dev dependency. refs #6638

Revision 1e70b294 (diff)
Added by Brett Smith almost 4 years ago

6638: Python backports declare all their C dependencies.

See #6638 for discussion about how these dependency lists were
generated.

Revision e88a8ced (diff)
Added by Brett Smith almost 4 years ago

6638: Python backports declare all their C dependencies.

See #6638 for discussion about how these dependency lists were
generated.

Revision 7ca887a1
Added by Brett Smith almost 4 years ago

Merge branch '6638-python-backport-dependencies-wip'

Closes #6638.

Revision c271c93c (diff)
Added by Brett Smith almost 4 years ago

6638/7370: Force new builds of Python backports with dependencies.

Even though we've declared these dependencies for a while now, Jenkins
has not published packages with them, because without a new upstream
version, fpm believes that there's no new package to build. This
resolves that by building a new iteration of the affected packages.

This is less than ideal, because if a new version is released, we'll
automatically package it with iteration 2. That is not correct, but
it doesn't affect any functionality, and we already have a plan to do
things properly in #6885. So we'll live with "correct functionality,
gross aesthetics" until then.

Ward approved in conversation. Refs #6638, #7370.

History

#1 Updated by Peter Amstutz about 4 years ago

  • Target version set to Bug Triage

#2 Updated by Brett Smith about 4 years ago

  • Subject changed from python-arvados-python-client package should depend on libcurl3-gnutls to [Deployment] Python package backports should declare their C dependencies
  • Description updated (diff)
  • Category set to Deployment
  • Target version changed from Bug Triage to 2015-08-19 sprint
  • Story points set to 1.0

#3 Updated by Brett Smith about 4 years ago

  • Description updated (diff)

#4 Updated by Brett Smith about 4 years ago

  • Description updated (diff)

#5 Updated by Nico César about 4 years ago

fpm has a lot of options, and we're starting to use more and more.

we should have an arvados/services/:service:/fpm_build.sh that has all the needed options inside the script.

this will be called from arvados-dev/jenkins/run-build-packages.sh

#6 Updated by Ward Vandewege about 4 years ago

Suggest we get the list of dependencies to add from the existing packages in Debian and CentOS. For most of our python backports, these should exist - we just backport to get newer versions.

#7 Updated by Tom Clegg about 4 years ago

  • Assigned To set to Tom Clegg

#8 Updated by Tom Clegg about 4 years ago

  • Status changed from New to In Progress

#9 Updated by Tom Clegg about 4 years ago

debian7 has a bit of a hiccup:

$ docker run -it -v {...}/arvados/packages/debian7:/pkg:ro debian:7
# apt-get update; dpkg -i /pkg/python*.deb; apt-get -f install
...
dpkg: error processing udev (--configure):
 subprocess installed post-installation script returned error exit status 2
dpkg: dependency problems prevent configuration of fuse:
 fuse depends on udev | makedev; however:
  Package udev is not configured yet.
  Package makedev is not installed.

dpkg: error processing fuse (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of python-llfuse:
 python-llfuse depends on fuse; however:
  Package fuse is not configured yet.

dpkg: error processing python-llfuse (--configure):
 dependency problems - leaving unconfigured
Setting up python-minimal (2.7.3-4+deb7u1) ...
Setting up python (2.7.3-4+deb7u1) ...
Setting up python-support (1.0.15) ...
Setting up python-lockfile (1:0.8-2) ...
Setting up python-daemon (1.5.5-1) ...
dpkg: dependency problems prevent configuration of python-arvados-fuse:
 python-arvados-fuse depends on python-llfuse (>= 0.40); however:
  Package python-llfuse is not configured yet.

dpkg: error processing python-arvados-fuse (--configure):
 dependency problems - leaving unconfigured
...
Errors were encountered while processing:
 udev
 fuse
 python-llfuse
 python-arvados-fuse
E: Sub-process /usr/bin/dpkg returned an error code (1)

Another apt-get -f install makes everything OK.

#10 Updated by Tom Clegg about 4 years ago

Branches:

I'm not certain all of the dependencies are listed, but the known deps are covered (fuse and libyaml) and I don't think the possibility of finding more should hold up merging the specific dependency problems that are fixed here, or the framework that allows us to deal with more.

These branches include some incidental fixes in arvados-dev and arvados/sdk/pam/ that fix bugs/annoyances I encountered along the way.

#11 Updated by Tom Clegg about 4 years ago

I didn't run into libattr troubles so it didn't get listed as a dependency. It seems to be included in the docker images for all five supported targets. I'm guessing it's mentioned here only because its corresponding -dev package is not included in base docker images, and is needed to build the llfuse package. In that case, I'd suggest both of these should happen (but shouldn't necessarily block merging the current branches):
  1. List libattr as a dependency. A dependency that's included in the most minimal "base" image we test with is still a dependency.
  2. List libattr-dev as a build dependency. fpm-info.sh seems like a good place to mention this (as I did for libpq-dev) even though nothing pays attention to it yet. It seems logical to me that -- just like runtime dependencies -- each package should mention its own build dependencies, and the build scripts should merge them all into a "can install everything" builder image if that's the way it wants to do things. Currently we only publish/remember "you need all these deps if you want to build all the arvados things".

#12 Updated by Brett Smith about 4 years ago

  • Description updated (diff)

Tom Clegg wrote:

I didn't run into libattr troubles so it didn't get listed as a dependency. It seems to be included in the docker images for all five supported targets.

On Debian, at least, it's a Depends: of coreutils, so… there's that.

I'm guessing it's mentioned here only because its corresponding -dev package is not included in base docker images, and is needed to build the llfuse package.

More or less. I figured/assumed needing the -dev package to build meant the binary would require the corresponding library. In retrospect, obviously that was pretty faulty logic. The end result is that the description is buggy.

  1. List libattr as a dependency. A dependency that's included in the most minimal "base" image we test with is still a dependency.

If we want to be as much like the distros as possible, we should follow their packaging rules. And the Debian Policy Manual indicates that you should list all dependencies, with no exception for "base" packages (which in Debian's case means the package has Essential: yes). So +1 to the rationale.

But in this specific case, libattr doesn't seem to be a dependency of the built package: ldd on the built .so does not list libattr as linked. So no need for it to be listed at all.

  2. List libattr-dev as a build dependency.

This is a fine idea, but build dependencies go on source packages (as opposed to binary packages), which we don't build at this time. So I feel like it's very out of scope for this issue.

#13 Updated by Peter Amstutz about 4 years ago

libattr-dev is needed for xattr.h which is needed by libfuse-dev which is needed to build the native code portion of llfuse.

#14 Updated by Peter Amstutz about 4 years ago

centos6:

$ rpm -i python27-python-*.rpm
python27-python-setuptools is needed by python27-python-daemon-2.0.5-1.noarch
python27-python-docutils is needed by python27-python-daemon-2.0.5-1.noarch
python27-python-distribute is needed by python27-python-pyvcf-0.6.7-1.noarch

#15 Updated by Brett Smith about 4 years ago

A few notes on what's been merged so far:

Right now the Python SDK depends on a couple of -dev packages. There are a couple of issues that make me think this isn't quite right:

  • -dev packages include the headers and tools necessary to build a binary that's linked to a particular library. The contents of these packages are very rarely used when the binary runs, and I can't imagine a situation where that would be true for our own packages. As long as that's right, they shouldn't be declared as Depends: of the package.
  • Since our Python SDK doesn't include any C, there should be no need to declare separate dependencies for it anyway. fpm figures out Python dependencies just fine. Per the story, these dependencies should exist on the packages that have binary code.

Were these supposed to be build_depends, maybe?

Beyond that: the list of dependencies for PyCURL is incomplete. I don't know if this is 100% accurate, but here's the best way I know to figure out the dependencies of a binary after the fact:

% ldd FILENAME | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u

Running that on the pycurl.so we build on wheezy gets me:

libc6
libcomerr2
libcurl3-gnutls
libgcrypt11
libgnutls26
libgpg-error0
libgssapi-krb5-2
libidn11
libk5crypto3
libkeyutils1
libkrb5-3
libkrb5support0
libldap-2.4-2
libp11-kit0
librtmp0
libsasl2-2
libssh2-1
libtasn1-3
zlib1g

I would expect the list of Depends: for python-pycurl to look close to that.
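To make the pipeline in this note easier to follow, here is a small sketch of its first two stages run against canned ldd output. The sample lines and the function names resolved_libs and sample_ldd_output are made up for illustration; only the awk filter and sort -u come from the command itself.

```shell
#!/bin/bash
# Keep only the libraries that ldd resolved to an absolute path. In a line like
#   libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)
# the path is field 3. The vDSO and dynamic loader lines have no path in
# field 3, so the filter drops them.
resolved_libs() {
    awk '($3 ~ /^\//){print $3}' | sort -u
}

# Illustrative input only; real ldd output has different paths and addresses.
sample_ldd_output() {
    cat <<'EOF'
    linux-vdso.so.1 =>  (0x00007ffd00000000)
    libcurl-gnutls.so.4 => /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007f0000000000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000200000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0000400000)
EOF
}

sample_ldd_output | resolved_libs
```

The remaining stages then map each path to its owning package: dpkg -S prints lines of the form "package: /path", and cut -d: -f1 keeps the package name. Because the loader line is filtered out, a package that only provides the loader would not appear unless another library pulls it in, which is one reason the result may not be 100% accurate.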

#16 Updated by Tom Clegg about 4 years ago

Another bug: the deps and build-deps can't just say "openssl | gnutls" because the runtime dependency depends on which -dev library the package was built with.

AFAICT this means python-pycurl-openssl and python-pycurl-gnutls are two different backport packages. We can offer one or the other, or both. Offering both will be annoying because we can't build both with the same container (at least without additional futzing). Surely easiest to pick one, and call it python-pycurl.

On the plus side, the runtime packages aren't mutually exclusive like the -dev packages are. So if we pick one, the existence of the other runtime on the target system shouldn't make our package uninstallable.

I suppose we prefer gnutls?

#17 Updated by Brett Smith about 4 years ago

Tom Clegg wrote:

Another bug: the deps and build-deps can't just say "openssl | gnutls" because the runtime dependency depends on which -dev library the package was built with.

AFAICT this means python-pycurl-openssl and python-pycurl-gnutls are two different backport packages. We can offer one or the other, or both. Offering both will be annoying because we can't build both with the same container (at least without additional futzing). Surely easiest to pick one, and call it python-pycurl.

Yes, we should do that.

On the plus side, the runtime packages aren't mutually exclusive like the -dev packages are. So if we pick one, the existence of the other runtime on the target system shouldn't make our package uninstallable.

Correct.

I suppose we prefer gnutls?

I think it would be best if we can prefer whatever the default libcurl package on the underlying distro used. This may create even more differences in dependencies across distros. Not sure that's avoidable in the long run anyway.

#18 Updated by Brett Smith about 4 years ago

  • Assigned To deleted (Tom Clegg)
  • Target version changed from 2015-08-19 sprint to 2015-09-02 sprint

#19 Updated by Brett Smith about 4 years ago

  • Assigned To set to Brett Smith

#20 Updated by Brett Smith almost 4 years ago

6638-python-backport-dependencies-wip is up for review. I built all the packages and confirmed you could install them all on CentOS with a simple yum install python27-python*.rpm in a bare Docker container, after installing the python27 Software Collection.

Methodology:

  1. Conveniently, the Python backports that have compiled code in them are clearly marked by filename: they have the architecture as x86_64/amd64 instead of noarch/all.
  2. For each distro we support, grab the latest version of each backport that includes compiled code.
  3. Extract all the .so files from those packages.
  4. For each distro we support, start our Docker build image with these .so files mounted under /mnt.
  5. In that container, run the command in note-15 to generate the list of dependencies for each .so. (The RPM version is: ldd FILENAME | awk '($3 ~ /^\//){print $3}' | sort -u | xargs rpm -qf | sort -u)
  6. Add that list to fpm-info.sh for the backport.
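The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the script that was actually used: the function names are hypothetical, the filename patterns assume the usual name_version_arch.deb and name-version-release.arch.rpm conventions, and printing --depends flags is only one guess at a convenient form, since the format fpm-info.sh expects is not shown in this ticket.

```shell
#!/bin/bash
# Step 1: packages with compiled code carry a real architecture in the
# filename (amd64 / x86_64) rather than all / noarch.
has_compiled_code() {
    case "$1" in
        *_amd64.deb|*.x86_64.rpm) return 0 ;;
        *) return 1 ;;
    esac
}

# Step 5: list the packages that own the libraries one .so links against,
# using whichever package database the container provides.
so_packages() {
    if command -v dpkg >/dev/null 2>&1; then
        ldd "$1" | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
    else
        ldd "$1" | awk '($3 ~ /^\//){print $3}' | sort -u | xargs rpm -qf | sort -u
    fi
}

# Step 6 (assumed form): print each name read from stdin as an fpm
# --depends flag, skipping blank lines.
depends_args() {
    local name
    while IFS= read -r name; do
        if [ -n "$name" ]; then
            printf -- '--depends %s\n' "$name"
        fi
    done
}

# Step 4 mounts the extracted .so files under /mnt; do nothing if none exist.
for so in /mnt/*.so; do
    [ -e "$so" ] || continue
    echo "== $so"
    so_packages "$so" | depends_args
done
```

Note that the rpm -qf branch prints full name-version-release strings, as in the centos6 results, so those would still need trimming to bare package names before being declared as dependencies.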

Results

centos6

[root@a5decf669d5f mnt]# ldd pycurl.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs rpm -qf | sort -u
cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64
glibc-2.12-1.166.el6_7.1.x86_64
keyutils-libs-1.4-5.el6.x86_64
krb5-libs-1.10.3-42.el6.x86_64
libcom_err-1.41.12-22.el6.x86_64
libcurl-7.19.7-46.el6.x86_64
libidn-1.18-2.el6.x86_64
libselinux-2.0.94-5.8.el6.x86_64
libssh2-1.4.2-1.el6_6.1.x86_64
nspr-4.10.8-1.el6_6.x86_64
nss-3.19.1-3.el6_6.x86_64
nss-softokn-freebl-3.14.3-22.el6_6.x86_64
nss-util-3.19.1-1.el6_6.x86_64
openldap-2.4.39-8.el6.x86_64
openssl-1.0.1e-42.el6.x86_64
zlib-1.2.3-29.el6.x86_64

[root@a5decf669d5f mnt]# ldd capi.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs rpm -qf | sort -u
fuse-libs-2.8.3-4.el6.x86_64
glibc-2.12-1.166.el6_7.1.x86_64

[root@a5decf669d5f mnt]# ldd ciso8601.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs rpm -qf | sort -u
glibc-2.12-1.166.el6_7.1.x86_64

[root@a5decf669d5f mnt]# ldd _*.so strxor.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs rpm -qf | sort -u
glibc-2.12-1.166.el6_7.1.x86_64

debian7

root@48b0b40d8efc:/mnt# ldd pycurl.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libcomerr2
libcurl3-gnutls
libgcrypt11
libgnutls26
libgpg-error0
libgssapi-krb5-2
libidn11
libk5crypto3
libkeyutils1
libkrb5-3
libkrb5support0
libldap-2.4-2
libp11-kit0
librtmp0
libsasl2-2
libssh2-1
libtasn1-3
zlib1g

root@48b0b40d8efc:/mnt# ldd capi.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libfuse2

root@48b0b40d8efc:/mnt# ldd ciso8601.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6

root@48b0b40d8efc:/mnt# ldd _*.so strxor.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6

debian8

root@79fcc3787ad1:/mnt# ldd pycurl.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libcomerr2
libcurl3-gnutls
libffi6
libgcrypt20
libgmp10
libgnutls-deb0-28
libgpg-error0
libgssapi-krb5-2
libhogweed2
libidn11
libk5crypto3
libkeyutils1
libkrb5-3
libkrb5support0
libldap-2.4-2
libnettle4
libp11-kit0
librtmp1
libsasl2-2
libssh2-1
libtasn1-6
zlib1g

root@79fcc3787ad1:/mnt# ldd capi.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libfuse2

root@79fcc3787ad1:/mnt# ldd ciso8601.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6

root@79fcc3787ad1:/mnt# ldd _*.so strxor.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libgmp10

ubuntu1204

root@6d1dbd80462e:/mnt# ldd pycurl.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libasn1-8-heimdal
libc6
libcomerr2
libcurl3-gnutls
libgcrypt11
libgnutls26
libgpg-error0
libgssapi-krb5-2
libgssapi3-heimdal
libhcrypto4-heimdal
libheimbase1-heimdal
libheimntlm0-heimdal
libhx509-5-heimdal
libidn11
libk5crypto3
libkeyutils1
libkrb5-26-heimdal
libkrb5-3
libkrb5support0
libldap-2.4-2
libp11-kit0
libroken18-heimdal
librtmp0
libsasl2-2
libsqlite3-0
libtasn1-3
libwind0-heimdal
zlib1g

root@6d1dbd80462e:/mnt# ldd capi.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libfuse2

root@6d1dbd80462e:/mnt# ldd ciso8601.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6

root@6d1dbd80462e:/mnt# ldd _*.so strxor.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6

ubuntu1404

root@05d80ee46843:/mnt# ldd pycurl.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libasn1-8-heimdal
libc6
libcomerr2
libcurl3-gnutls
libffi6
libgcrypt11
libgnutls26
libgpg-error0
libgssapi-krb5-2
libgssapi3-heimdal
libhcrypto4-heimdal
libheimbase1-heimdal
libheimntlm0-heimdal
libhx509-5-heimdal
libidn11
libk5crypto3
libkeyutils1
libkrb5-26-heimdal
libkrb5-3
libkrb5support0
libldap-2.4-2
libp11-kit0
libroken18-heimdal
librtmp0
libsasl2-2
libsqlite3-0
libtasn1-6
libwind0-heimdal
zlib1g

root@05d80ee46843:/mnt# ldd capi.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6
libfuse2

root@05d80ee46843:/mnt# ldd ciso8601.so | awk '($3 ~ /^\//){print $3}' | sort -u | xargs dpkg -S | cut -d: -f1 | sort -u
libc6

[We don't backport pycrypto here.  Ubuntu's version is good enough.]

#21 Updated by Peter Amstutz almost 4 years ago

I have pushed a branch package-install-testing to arvados-dev which automates package install testing for the Python packages by installing into a base image for each supported distro. Using this, I can confirm that the dependency fixes in 6638-python-backport-dependencies-wip result in working packages.

I would like to propose the scripts in package-install-testing be reviewed and considered for a new Jenkins step.

#22 Updated by Nico César almost 4 years ago

Peter Amstutz wrote:

I have pushed a branch package-install-testing to arvados-dev which automates package install testing for the Python packages by installing into a base image for each supported distro. Using this, I can confirm that the dependency fixes in 6638-python-backport-dependencies-wip result in working packages.

I would like to propose the scripts in package-install-testing be reviewed and considered for a new Jenkins step.

I'm reviewing 21a3d01379891f2670991e4d24804e1dc87a1ab1

  1. where is common-test-packages.sh supposed to run? doing a find . -name "*.so" could end up with a lot of false positives. Can we mitigate this problem somehow?
  2. it's unclear from the scripts how -- should be used
  3. code like:
    if ! $pkg --run-test ; then
        FAIL=1
        ERRORS="$ERRORS\n$pkg has install errors"
    fi

    should be migrated to getops version, just like we have on the other scripts in jenkins dir
  4. add usage() too.
  5. it looks to me that if I run deb-common-test-packages.sh with no parameters I end up with 'echo "deb file:///mnt /" >>/etc/apt/sources.list ' ... is that true?
  6. I see $WORKSPACE/packages as the expected place to have all packages. This is not a problem if we execute these scripts inside the current Jenkins job, but if we want to split up the current job into 3 (build, test and upload), this WILL be a problem. I'm creating a story for the split itself.

#23 Updated by Brett Smith almost 4 years ago

To keep this ticket from getting too sidetracked, I've created https://arvados.org/issues/7184 for further discussion of package testing. Per note-21, I'm going to merge my branch and close this, since the reported issue is resolved.

#24 Updated by Brett Smith almost 4 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:7ca887a10c55b7fe9400bd3c536e721115a28a6e.
