Project

General

Profile

Actions

Bug #21721

closed

Review Python SDK dependencies

Added by Peter Amstutz 12 days ago. Updated 4 days ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
SDKs
Story points:
-
Release:
Release relationship:
Auto

Subtasks 1 (0 open1 closed)

Task #21726: Review 21721-refresh-python-depsResolvedBrett Smith04/25/2024Actions

Related issues

Related to Arvados - Idea #20311: Update Python packages to build with PEP 517/518NewActions
Has duplicate Arvados - Idea #21355: Modernize pycurl dependencyDuplicateBrett SmithActions
Actions #1

Updated by Peter Amstutz 12 days ago

  • Assigned To set to Brett Smith
Actions #2

Updated by Brett Smith 12 days ago

21721-refresh-python-deps @ 8a795bf668174147e9f254fa3af9b0be4a14a973 - developer-run-tests: #4187

This branch makes it so that none of our requirements in setup.py have a <= version pin.

  • Most of our pins were there not because we needed them directly, but we needed to maintain compatibility with other older things and pinning the version was the best way to make that happen under the old pip resolver. fab6a2603007e5ef60385148d09d51f3ed0f5b2c standardized our build process on the new pip resolver, so that's no longer a concern.
  • Others could be removed entirely because they weren't actually used or had stdlib replacements.
  • For other change rationales, see individual commit messages.
  • All agreed upon points are implemented / addressed.
    • Yes.
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • We talked briefly about pinning some dependencies with ~=, but (a) we didn't commit to a review process for those, so I take it nobody's interested in taking on that cost, and (b) most of what we pinned wasn't our direct dependencies but indirect dependencies to force pip to do what we wanted. Given all this, I don't see much need to go forward with it.
  • Code is tested and passing, both automated and manual, what manual testing was done is described
    • See above. Also tested on my own system by manually upgrading individual packages in the test virtualenv and running tests to completion.
  • Documentation has been updated.
    • N/A
  • Behaves appropriately at the intended scale (describe intended scale).
    • No change in scale
  • Considered backwards and forwards compatibility issues between client and server.
    • I did not change any >= requirements, so anything we supported before is still supported after this branch.
  • Follows our coding standards and GUI style guidelines.
    • Yes
Actions #3

Updated by Peter Amstutz 11 days ago

  • Related to Idea #20311: Update Python packages to build with PEP 517/518 added
Actions #4

Updated by Peter Amstutz 11 days ago

LGTM with one question: looks like you removed from future.utils import viewvalues in one place but from future import ... shows up in other places, and we're still depending on the future package

Also there's several other packages (arvados-fuse comes to mind) that could also use some attention. I assume you wanted to avoid scope creep.

For general discussion:

On the topic of using ~= for versioning, my experience is there's no perfect approach, either you pin conservatively and potentially create problems with version conflicts or security issues, or don't pin and have some dude on the internet post a bad package and break your already-released software. The dependabot approach of automatically bumping up the upper version and running tests when a new dependency comes out is probably the least bad one, then our release will reflect a known good configuration.

It would make sense to revisit that once we have pyproject.toml implemented, because dependabot can updates those: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#pip-and-pip-compile

Actions #5

Updated by Brett Smith 10 days ago

To get to the general point first:

Also there's several other packages (arvados-fuse comes to mind) that could also use some attention. I assume you wanted to avoid scope creep.

More or less. We made this ticket in the context of talking about upgrading dependencies, either for security or maintenance. So I understood our goal was making sure that we could use the latest versions of all our dependencies. So, like I said, the branch puts us in a state where we get that, and no Python dependency has a version cap using a < operator. I looked at FUSE, but it has no dependencies like that, which is why I ended up not touching it. Same goes for for dockercleaner, crunchstat-summary, and user-activity.

With that as my goal, I removed version-capped dependencies that were straightforward to remove (which was most of it), and then uncapped everything else after digging through the history a bit to convince myself it has since become safe.

Everything that's left we're at least nominally using. Happy to reevaluate things but I do think they would make sense as separate tickets to talk about do we want to drop the functionality, or replace it with another library, or what.

LGTM with one question: looks like you removed from future.utils import viewvalues in one place but from future import ... shows up in other places, and we're still depending on the future package

Yeah, the <list|view><keys|values|items> functions are all straightforward wrappers around built-in dict methods, so those are very easy to clean and I did so where it was in front of me. But other things I don't know what they do off-hand, like future.standard_library.install_aliases(), and I didn't think they were in scope for this ticket since we're already on the latest version of future. We already have #21356 to remove that dependency.

For general discussion:

On the topic of using ~= for versioning, my experience is there's no perfect approach, either you pin conservatively and potentially create problems with version conflicts or security issues, or don't pin and have some dude on the internet post a bad package and break your already-released software.

For what it's worth I'd be fine with pin-and-dependabot too, but I'll point out, the historical record doesn't back up the idea this has been a major problem in our Python packages specifically. I thought it was too before I started working on this branch, but as I did I realized it wasn't true. The only cases I'm aware of where we actually pinned a direct dependency are:

  • We pinned google-api-python-client after 2.0 switched to static discovery documents by default. 2.1 restored backwards compatibility, so once that came out we could've switched the pin to !=2.0 and been fine.
  • We pinned msgpack after 1.0.4 had an installer bug. I haven't dug all the way through that history but assuming it got fixed, again, that could've been !=1.0.4.
  • We pinned docker because of a real API change.

Everything else we pinned was not because our direct dependency broke, but because we wanted to coerce pip to resolve dependencies a particular way—which will be much less of an issue going forward now that the new resolver is three years old. So, two buggy releases and one real API break over the span of the ten years we've had the Python SDK. For me personally, a problem that crops up less than once every three years doesn't seem high on our list of engineering priorities.

Actions #6

Updated by Brett Smith 10 days ago

  • Status changed from New to Resolved
Actions #7

Updated by Brett Smith 7 days ago

  • Has duplicate Idea #21355: Modernize pycurl dependency added
Actions #8

Updated by Peter Amstutz 4 days ago

  • Release set to 70
Actions

Also available in: Atom PDF